WO2025093970A1 - Systems and methods for operating room management (Systèmes et procédés de gestion de salle opératoire)
- Publication number
- WO2025093970A1 (PCT application PCT/IB2024/059933)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- operating room
- audio
- machine learning
- visual
- input
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H40/00—ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices
- G16H40/20—ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices for the management or administration of healthcare resources or facilities, e.g. managing hospital staff or surgery rooms
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H20/00—ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance
- G16H20/40—ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance relating to mechanical, radiation or invasive therapies, e.g. surgery, laser therapy, dialysis or acupuncture
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H40/00—ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices
- G16H40/40—ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices for the management of medical equipment or devices, e.g. scheduling maintenance or upgrades
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H40/00—ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices
- G16H40/60—ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices for the operation of medical equipment or devices
- G16H40/63—ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices for the operation of medical equipment or devices for local operation
Definitions
- the disclosure relates to the digital health field.
- the disclosure is directed to systems with software and hardware components, for example, hardware components in the operating room and hospital, and software that can run on computers and/or mobile devices, for example, in the cloud.
- the systems and methods described herein have software and machine learning algorithm components.
- ORs Operating Rooms
- OR management is often done in a sub-optimal way. This is reflected by low total OR utilization (e.g., measured as the percentage of effective OR use over a predefined period of time), long turnover time between cases (i.e., the time to clean up the room, prepare it and the patient for the next case, and synchronize these activities), and a low percentage of on-time surgery starts (including the day's first surgery).
- HCWs Healthcare Workers
- OR staff are responsible for maintaining aseptic technique. In particular, the scrub nurse is in charge of identifying any events that could lead to a loss of sterility (e.g., touching or approaching non-sterile surfaces, accidental drops, etc.). Given the number of scrub tasks and the limited observation capability (e.g., field of visibility), complete observation is not possible. This may contribute to an increase in the total number of SSIs (Surgical Site Infections), which remains high even in developed countries (up to 11% for certain procedures).
- SSI Surgical Site Infections
- the systems and methods described herein assist an operating room (OR) team during all surgery stages, assist with OR management, and/or enable surgeon- and surgery-specific training, analytics, and intelligence for administrators, and provide optimization algorithms to decrease OR costs.
- the systems and methods described herein comprise an operating room (OR) hardware module with cameras and sensors, processing stations, cloud, terminals and associated machine learning (ML) software.
- the present disclosure is directed to a system for management of an operating room, the system comprising: one or more cameras for receiving visual input from the operating room; optionally, one or more sensors for receiving additional input from the operating room, the one or more sensors based on one or more of the following: Wi-Fi, Bluetooth, ZigBee, Thread, radio frequency identification (RFID), ultra-wide band (UWB), inertial measurement unit (IMU), visible light communication (VLC), infrared (IR), ultrasonic, geomagnetic, light detection and ranging (LiDAR), and computer vision; optionally, one or more audio input/output (I/O) devices (e.g., microphones and/or speakers) for receiving audio input from the operating room and/or for producing audio output to one or more users; one or more displays (e.g., screen, holographic display, or other display) (e.g., a display of an augmented reality (AR), virtual reality (VR), and/or mixed reality (MR) device) for augmented reality (AR), augmented
- the present disclosure is directed to a system for management of an operating room, the system comprising: one or more input devices, the one or more input devices comprising (i) one or more cameras for receiving visual input from the operating room, (ii) one or more sensors for receiving additional input from the operating room, and/or (iii) one or more audio input/output (I/O) devices for receiving audio input from the operating room (e.g., and also for producing audio output to one or more users); one or more output devices, the one or more output devices comprising: (i) one or more displays for presenting visual output data, (ii) one or more projectors for projecting directionally-controlled light onto one or more objects in the operating room, and/or (iii) one or more audio input/output (I/O) devices for producing audio output (e.g., and optionally also for receiving audio input from the operating room) to one or more users; a processor of a computing device for receiving and processing input data (e.g., the visual input received from
- the system further comprises a projector for projecting directionally-controlled light (e.g., laser light, a spotlight, or other light) onto one or more objects in the operating room, wherein the instructions, when executed by the processor, cause the processor to use at least a portion of the input data received from the operating room to spatially locate said one or more objects and cause the projector to illuminate said one or more objects as a visual aid for the one or more users in the operating room.
- directionally-controlled light e.g., laser light, a spotlight, or other light
- the projector is a physical (real world) projector for projecting directionally-controlled light onto the one or more (physical) objects in real space.
- the projector is a virtual projector (e.g., software) for projecting directionally-controlled light via an augmented reality (AR) display onto the one or more objects in the operating room as viewed by a user wearing an AR headset, where the projection of light is “onto” the object in the figurative sense since that is how said projection appears to the user wearing the headset.
- AR augmented reality
- the instructions when executed by the processor, cause the processor to use at least a portion of the input data received from the operating room to produce visual and/or audio output data via one or more machine learning modules in order to perform one or more functions, said output data rendered for presentation to at least one of the one or more users on the one or more displays in service of said one or more functions, wherein said one or more functions comprises at least one member selected from the group consisting of: person identification (e.g., identify who is who in the operating room); stage and/or task identification (e.g., identify a current stage of a surgical procedure and/or identify what stage will follow or will likely follow the current stage); predictive analytics (e.g., based on past experiences, predict a likely next action and/or a next goal); instrument identification (e.g., identify which instruments are used and/or prepared and/or handled); speech recognition and synthesis (e.g., interpret conversation and produce speech response); human emotion recognition (e.g., recognize emotions of one or more users in the operating
- the one or more functions comprises instrument identification
- the output data rendered for presentation to at least one of the one or more users comprises a graphical overlay (e.g., bounding box) rendered in relation to an identified instrument (e.g., as viewed via an AR or VR display) to graphically annotate (e.g., outline, highlight, denote with text, and/or graphically present a determined probability said identification is accurate) the identified instrument to the at least one of the one or more users via one or more of the one or more displays.
- a graphical overlay e.g., bounding box
- an identified instrument e.g., as viewed via an AR or VR display
- graphically annotate e.g., outline, highlight, denote with text, and/or graphically present a determined probability said identification is accurate
- the instrument identification is part of a process performed by a system algorithm, e.g., one or more of the use cases described herein, e.g., scrub assistance (e.g., Scrub Nurse), training, and/or instrument tray optimization.
- a system algorithm e.g., one or more of the use cases described herein, e.g., scrub assistance (e.g., Scrub Nurse), training, and/or instrument tray optimization.
- the one or more functions comprises context segmentation
- the output data rendered for presentation to at least one of the one or more users comprises a graphical warning presented via the display or an audible warning presented via an audio output device, said warning indicating a potential loss of sterility (e.g., due to touching of a non-sterile instrument by an individual).
- context segmentation comprises identifying and/or classifying surfaces (e.g., as sterile or non-sterile) and objects (e.g., as mobile or fixed), and the one or more functions comprises performing context segmentation as part of an event detection (e.g., detection of potential loss of sterility).
- the one or more functions comprises stage and/or task identification, wherein the processor determines (i) an identification of a current stage of a surgical procedure and/or (ii) an identification of what stage will follow or will likely follow the current stage, and wherein the output data rendered for presentation to at least one of the one or more users comprises illumination by the projector of one or more instruments that are likely to be used in the current stage and/or the next stage of the surgical procedure.
- the present disclosure is directed to a method for management of an operating room, the method comprising: receiving, by a processor of a computing device, input data from the operating room and producing visual and/or audio output data via one or more machine learning modules, said output data rendered for presentation to a user on a display, wherein the input data comprises one or more of (i)-(iii) as follows: (i) visual input from the operating room detected by one or more cameras; (ii) input from one or more sensors selected from the group consisting of the following: Wi-Fi, Bluetooth, ZigBee, Thread, radio frequency identification (RFID), ultra-wide band (UWB), inertial measurement unit (IMU), visible light communication (VLC), infrared (IR), ultrasonic, geomagnetic, light detection and ranging (LiDAR), and computer vision; and (iii) audio input from one or more audio input/output (I/O) devices in the operating room.
- RFID radio frequency identification
- UWB ultra-wide band
- IMU inertial measurement unit
- the present disclosure is directed to a method for management of an operating room, the method comprising: receiving, by a processor of a computing device, input data from the operating room and producing visual and/or audio output data via one or more machine learning modules (e.g., said output data rendered for presentation to a user on a display), wherein the input data comprises one or more of (i)-(iii) as follows: (i) visual input from the operating room detected by one or more cameras; (ii) input from one or more sensors (e.g., selected from the group consisting of the following: Wi-Fi, Bluetooth, Thread, ZigBee, radio frequency identification (RFID), ultra-wide band (UWB), inertial measurement unit (IMU), visible light communication (VLC), infrared (IR), ultrasonic, geomagnetic, light detection and ranging (LiDAR), and computer vision); and (iii) audio input from one or more audio input/output (I/O) (e.g., microphone (s)) devices in
- the method comprises producing visual and/or audio output data via the one or more machine learning modules in order to perform one or more functions selected from the group consisting of: person identification (e.g., identify who is who in the operating room); stage and/or task identification (e.g., identify a current stage of a surgical procedure and/or identify what stage will follow or will likely follow the current stage); predictive analytics (e.g., based on past experiences, predict a likely next action and/or a next goal); instrument identification (e.g., identify which instruments are used and/or prepared and/or handled); speech recognition and synthesis (e.g., interpret conversation and produce speech response); human emotion recognition (e.g., recognize emotions of one or more users in the operating room); event detection (e.g., detect selected events in the operating room); spatial localization (e.g., spatially localize person(s) and/or equipment in the operating room); context segmentation (e.g., identify and/or tag people and/or equipment in the operating room,
- the one or more functions comprises instrument identification
- the output data comprises a graphical overlay (e.g., bounding box) rendered in relation to an identified instrument (e.g., as viewed via an AR, VR, or MR display) to graphically annotate (e.g., outline, highlight, denote with text, and/or graphically present a determined probability said identification is accurate) the identified instrument to the at least one of the one or more users via one or more displays.
- a graphical overlay e.g., bounding box
- an identified instrument e.g., as viewed via an AR, VR, or MR display
- graphically annotate e.g., outline, highlight, denote with text, and/or graphically present a determined probability said identification is accurate
- the one or more functions comprises context segmentation
- the output data comprises a graphical warning presented via a display or an audible warning presented via an audio output device, said warning indicating a potential loss of sterility (e.g., due to touching of a non-sterile instrument by an individual).
- context segmentation comprises identifying and/or classifying surfaces (e.g., as sterile or non-sterile) and objects (e.g., as mobile or fixed)
- the one or more functions comprises performing context segmentation as part of an event detection (e.g., detection of potential loss of sterility).
- the one or more functions comprises (i) identifying, by the processor of the computing device, a current stage of a surgical procedure and/or (ii) identifying, by the processor of the computing device, what stage will follow or will likely follow the current stage, and wherein the output data comprises illumination by the projector of one or more instruments that are likely to be used in the current stage and/or the next stage of the surgical procedure.
- FIG. 1 is a schematic diagram of system operating room (OR) elements and setup, according to some illustrative embodiments.
- FIG. 2 is a screenshot of a graphical user interface displaying information shown on the OR Screen, according to some illustrative embodiments.
- FIG. 3 is a schematic diagram depicting a computing network and computing devices integrated with the system, according to some illustrative embodiments.
- FIG. 4 is a schematic diagram depicting example stages and tasks performed by the system, according to some illustrative embodiments.
- FIG. 5 depicts an automatic instrument recognition task performed by the system, according to some illustrative embodiments.
- FIG. 6 is a screenshot of a graphical user interface showing an example warning linked to potential sterility loss due to touching of non-sterile surface (here, a non-protected OR light), according to some illustrative embodiments.
- FIG. 7 depicts an automatic spatial localization task using bounding boxes, performed by the system, according to some illustrative embodiments. Note several regions corresponding to the same object (e.g., the Surgeon) and localization of surgical instruments (here, scissors).
- FIG. 8 depicts an automatic context segmentation performed by the system, according to some illustrative embodiments.
- FIG. 9 is a screenshot of a graphical user interface showing an example of the instrument learning function, according to some illustrative embodiments.
- FIG. 10 is a screenshot of a graphical user interface showing an example instrument table layouts learning function, according to some illustrative embodiments.
- FIG. 11 is a screenshot of a graphical user interface showing an example for training of instrumentation-task assignment, according to some illustrative embodiments.
- FIG. 12 is a screenshot of a graphical user interface showing an example employee dashboard showing OR/Surgery performance, training compliance, and task assessment, according to some illustrative embodiments.
- FIG. 13 is a schematic diagram illustrating Scrub Nurse use cases facilitated by the system, according to some illustrative embodiments.
- FIG. 14 is a screenshot of a graphical user interface showing example assistance shown during surgery, according to some illustrative embodiments.
- FIG. 15 depicts a graphical user interface showing a VR/AR/MR overlay depicting an OR Staff member showing an instrument to the system to request assistance in its identification and/or use, with the automatically determined identification shown in the overlay, along with a probability that said determination is accurate, according to some illustrative embodiments.
- FIG. 16 is a screenshot of a graphical user interface showing assistance that is presented based on a detected instrument, surgery stage, staff exchange, or other trigger, according to some illustrative embodiments.
- FIG. 17 is a schematic diagram illustrating Surgeon use cases facilitated by the system, according to some illustrative embodiments.
- FIG. 18 is a screenshot of a graphical user interface presented on a mobile device, the screenshot showing a follow-up visible on a Surgeon Terminal in a Surgeon use case, according to some illustrative embodiments.
- FIG. 19 is a schematic diagram illustrating Nurse use cases facilitated by the system, according to some illustrative embodiments.
- FIG. 20 is a screenshot of a graphical user interface presented on a mobile device, the screenshot showing a Nurse Request Equipment use case visible on a Nurse Terminal in a Nurse use case, according to some illustrative embodiments.
- FIG. 21 is a schematic diagram illustrating OR Manager use cases facilitated by the system, according to some illustrative embodiments.
- FIG. 22 is a screenshot of a graphical user interface depicting an instrument optimization use case, according to some illustrative embodiments.
- FIG. 23 is a schematic diagram illustrating Administrator use cases facilitated by the system, according to some illustrative embodiments.
- FIGs. 24A-24C are schematic diagrams illustrating components of a VR headset with eye tracking capability, for use in the systems and methods described herein, according to illustrative embodiments of the present disclosure.
- FIG. 25 is a schematic diagram illustrating a network environment for use in providing systems, methods, and architectures as described herein, according to some illustrative embodiments.
- FIG. 26 is a schematic diagram illustrating a computing device and a mobile computing device that can be used in the systems and methods described herein, according to some illustrative embodiments.
- systems, architectures, devices, methods, and processes of the present disclosure encompass variations and adaptations developed using information from the embodiments described herein. Adaptation and/or modification of the systems, architectures, devices, methods, and processes described herein may be performed, as contemplated by this description.
- Headers are provided for the convenience of the reader - the presence and/or placement of a header is not intended to limit the scope of the subject matter described herein.
- the systems and methods described herein can be used in hospitals, ambulatory centers, teaching centers, schools, and/or other institutions or stakeholders in the healthcare system .
- the system has elements shown in FIG. 1.
- FIG. 1 is a schematic drawing of system OR elements and set-up.
- the OR Module includes Visual Input, Audio input/output (I/O), and Projector and Additional Sensors.
- the Processing Station can be present in the OR, connected to OR Screen, and/or can provide a Cloud Link. Additional devices can be integrated with the system, for example, an AR/VR/MR (augmented reality, virtual reality, and/or mixed reality) Headset and smart Instrument Tables and Trays.
- AR/VR/MR augmented reality, virtual reality, and/or mixed reality
- FIG. 1 there are the following users:
- the System has an OR Module that has several elements. These elements might be grouped together on one platform, e.g., suspended under the ceiling, standing on a tripod or cart, extending from the wall, or other arrangements. The elements might be all or partially separated, e.g., cameras separated in OR corners for better visibility, speakers and microphones separated, etc. Each of these elements might have its own stand, suspension, etc. These modules, or single elements, might be attached or integrated into existing OR equipment, e.g., lights, bed, tables, and the like. There can be wearable, head-mounted cameras, or cameras integrated with other equipment, e.g., an AR (Augmented Reality) headset.
- AR Augmented Reality
- Visual Input consists of one or more cameras. These cameras might be assembled in various configurations as described above: fixed, handheld, wearable, or integrated in OR equipment. These cameras might be electronically or manually controlled for their pan/tilt or zoom. Certain or all of the cameras might use wide-angle objectives or 360° views. The resolution of the cameras and their color depth, as well as any additional sensors (e.g., thermography, infrared or ultraviolet vision), can be adapted to the application. For example, a camera might have 4K or 1080P resolution, a frame rate of 30 fps, and/or zoom up to 30x.
- the system comprises a Projector.
- the Projector is a device that attracts user attention to one or more items, places or persons in the OR.
- the Projector can be a video projector, one or more spotlights controllable in light size, strength, and direction, one or more similarly controlled laser pointing devices, or others.
- Projection is done using wearable means, e.g., an AR or VR (Virtual Reality) headset, table rear projection or lighting, integrated screen technology with or without touch capabilities, holographic projection, projection on overlay transparent or semi-transparent materials or glass, or others.
- wearable means e.g., AR or VR (Virtual Reality) headset, table rear projection or lighting or integrated screen technology with or without touch capabilities, holographic projection, projection on overlay transparent or semi-transparent materials or glass or others.
- Audio I/O Input/Output
- one or more microphones are placed in one or more places. They can be attached to the OR module, walls, on tripods, integrated in other devices (e.g., AR/VR headset), and/or wearable.
- Microphones can be omnidirectional, directional, capacitor or dynamic, single ones or assembled in arrays, e.g., in order to identify the direction of the sound or help with signal processing (e.g., echo removal, clarity improvement, noise subtraction). Audio output produces the sound that can be heard by one or more stakeholders. This can be done with the usage of one or more speakers, headsets, or others.
- Speakers can be single units or assembled in arrays to improve the characteristics of sound. They can be directional or assembled on a pan/tilt device to direct sound to a specific person. They can be headphones. They can provide other functions, for example, noise removal, voice clarity improvement, and/or others.
- Sensors can be used by the system in the operating room. For example, these can be based on Wi-Fi, Bluetooth, ZigBee, Thread, Radio Frequency Identification (RFID), Ultra-Wide Band (UWB), Inertial Measurement Unit (IMU), Visible Light Communication (VLC), Infrared (IR), Ultrasonic, Geomagnetic, Light Detection and Ranging (LiDAR), and/or Computer Vision systems.
- the Sensors can be used for various tasks including SLAM (Simultaneous Localization and Mapping), segmentation, recognition, additional inputs into machine learning models, and others.
- SLAM Simultaneous Localization and Mapping
- segmentation recognition
- additional inputs into machine learning models and others.
- the Processing Station is a computer device that can do a part of data processing, connectivity or run specific computer programs. It may be located directly in the operating room, anywhere else in the hospital or integrated in the Cloud. It might be assigned to an operating room or serve several operating rooms at the same time. It may be directly connected to the OR Module in the operating room, via proxy, using network, wireless or others. For example, it can be cable connected to the OR Screen (e.g., using HDMI connection), connected to the OR Module using network connection (e.g. Ethernet) over which communication with all elements of the OR Module is established. Communication involves obtaining audio and video streams, sending audio streams back together with control signals for pan/tilt/zoom of cameras and controlling the projectors.
- The Processing Station might be connected by cable or wirelessly to AR Headsets in the OR. The Processing Station might be connected to the hospital LAN (Local Area Network) for integration with the hospital digital systems (e.g., PACS Picture Archiving and Communication System, EHR Electronic Health Record System, any hospital management or follow-up system). In certain embodiments, the Processing Station is connected to a WAN (Wide Area Network) for connection to the Cloud.
- LAN Local Area Network
- the Processing station is connected to a WAN (Wide Area Network) for connection to the Cloud.
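The description above leaves the concrete link between the OR Module and the Processing Station open. Purely as an illustration, the following minimal sketch shows a Processing Station pulling a network camera stream and handing each frame to an analysis callback; the RTSP URL, host name, and the analyze_frame callback are hypothetical placeholders, not part of the disclosure.

```python
# Minimal sketch (assumption): a Processing Station pulling an RTSP camera
# stream from the OR Module and handing frames to an analysis callback.
import cv2


def run_processing_station(stream_url: str, analyze_frame) -> None:
    """Read frames from a network camera stream and pass them to analysis."""
    capture = cv2.VideoCapture(stream_url)  # e.g., "rtsp://or-module.local/cam1" (hypothetical)
    if not capture.isOpened():
        raise RuntimeError(f"Cannot open stream: {stream_url}")
    try:
        while True:
            ok, frame = capture.read()
            if not ok:
                break  # stream ended or dropped
            analyze_frame(frame)  # e.g., person or instrument identification
    finally:
        capture.release()


# Example usage with a placeholder analysis step:
# run_processing_station("rtsp://or-module.local/cam1", lambda frame: None)
```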
- the OR Screen is used to present information in the OR (see FIG. 2).
- This can be a standard screen attached to the wall, hanging from a ceiling, a screen on a cart, a transparent screen, a virtual screen (e.g., a projection, e.g., a holographic projection), or other.
- a virtual screen e.g., a projection, e.g., a holographic projection
- the OR Screen might be AR/VR headsets.
- FIG. 2 is a screenshot of a graphical user interface displaying information shown on the OR Screen, according to some illustrative embodiments. Note the selection tabs on the top (the “Preparation” tab is selected), the information canvas covering most of the screen (green area), and the dashboard on the right showing key surgery information.
- Instrument Table: In the OR, surgical devices, instruments, and implants are placed on the Instrument Table.
- This can be a standard table capable of receiving projection from the projector, a table with rear lighting coming, e.g., from an integrated and controllable screen, a table integrating controlled laser pointing devices, or other. It might have one or more separate cameras for acquiring images from the table or a specific OR view.
- Instrument trays can be placed on the separate tables. These can be standard tables, tables with integrated cameras or projection capabilities or others.
- Instruments, devices, implants, medical equipment, liquids, and the like, that are used in the OR are referred to, herein, collectively, as Instruments.
- FIG. 3 is a schematic diagram depicting a computing network and computing devices integrated with the system, according to some illustrative embodiments.
- computer devices working with the system are shown.
- the Cloud is connected to processing stations, terminals, viewing stations, and data centers.
- the architecture of connection might be different, and not built around a star scheme.
- viewing stations or terminals can be connected to processing station
- hospital network can be connected to the Cloud
- processing station can be integrated in the Cloud, and similarly for data centers, and the like.
- the connection can be done using any network protocol, serial interfaces, wired or wireless, specific connection buses, or others.
- the Cloud refers to servers that are accessed over the Internet, and the software and databases that run on those servers.
- the Cloud runs computer software and algorithms for receiving, analyzing, and processing data from the processing station. These can be relatively high-level data that require minimal analysis, or low-level data that involve a significant amount of processing before being ready for further analysis. For example, data anonymization can be done on the Processing Station before leaving the hospital LAN, or it can be done on the Cloud. Similarly, person tracking and detection can be done on the Processing Station, or base data (e.g., images) can be sent to the Cloud, with processing, tracking, and detection done there. The Cloud provides connection to Viewing Stations, Terminals, and Data Centers.
- connections can be specific, providing specialized APIs and pre-processed data, and/or generic, requiring connecting devices to further process data and visualization by themselves.
- the Cloud might have scalable processing and data storage capacities to be able to adapt to the System's needs on-the-fly or in a configurable manner.
- Any of the devices can be inside the hospital network, e.g., on hospital LAN, or in an outside network, e.g., WAN, in a highly secure environment (e.g., behind firewalls) or in public networks depending on security, user requirements and usability.
- Network tunnels, VPNs, NAT, VLANs and other technologies can be used for obtaining specific results (e.g., increase in security, speed, reliability and others).
- Viewing Stations enable users to interact with the system, analyze data and configure parameters in an optimized setting.
- this can be a computer station with a screen and internet connection.
- This can be a standard PC, a PC with a touchscreen, or an all-in-one computer.
- The required function can be achieved via a web browser and a web app running in the Cloud.
- Alternative Viewing Stations can be mobile devices, tablets, laptops, thin clients, AR and VR systems, or connected wearables. The function can be achieved by dedicated applications connecting directly to various system elements, remote applications or web apps running on other devices or virtual systems.
- Viewing Stations might be user-specific (e.g., OR Manager, Surgeon etc) or generic, configurable, or provided as-is.
- Terminals are devices which enable users to interact with the system, communicate with the system or other users, analyze data, and configure parameters in an optimized setting.
- this can be a mobile device, belonging to the user or provided by the hospital, running an app and providing functionalities.
- Example functionalities could be messages, requests, status updates, checklists, knowledge base access and schedules described in Sections below.
- the Terminal might be a wearable device, e.g., a smartwatch, AR headset, or headphones, providing relevant information in the most adequate way (text, sound, video) when required.
- Terminals might be user-specific (e.g., OR Manager, Surgeon etc) or generic, configurable, or provided as-is.
- Company Data Centers are computers, servers and other devices that connect to the cloud for further information processing and data storage. These can be data centers used for further product improvements, telemetry, debugging, communication, standard or OTA (Over-The-Air) updates and others. They can obtain raw or preprocessed data (e.g., anonymized) for further analysis, processing, and service. Company Data Centers can interface with the system via Cloud or directly with any possible architecture.
- Hospital Network and Devices are any other devices that are not part of the system but can be important and useful for its function.
- Hospital Network and Devices may include PACS, EHR servers and gateways, imaging devices, robotics, inventory management and ERP systems, CRM systems, HR systems, and/or others.
- the illustrative System uses functions shown in Table 1 below to implement use cases. These functions are described in more detail in the Sections that follow. “X” denotes a link between the indicated use case and function: (1) Assist with devices and instrumentation, (2) Obtain help, (3) Detect events, (4) Communicate, (5) Assist with surgery preparation, (6) Get trained, (7) Get notified, (8) Request equipment, (9) Get analytics, (10) Follow-up, (11) Learn, (12) Simulate load, (13) Manage schedule, (14) Manage team, (15) Manage trainings, and (16) Optimize instrumentation. Table 2 provides a brief description of each function / use case from Table 1.
- Table 1 Example use case-to-function allocation.
- Table 2 Brief description of the use cases and functions. Example entry — Personnel assessment: skills, capabilities, history, style, and other parameters attached to a system user.
- Person identification is an important function used by the System for assistance, event detection, measurements and statistics, context understanding, and others.
- the goal of identification is to uniquely identify a person (e.g., Dr. Dee Fault) and his/her function (Surgeon, Nurse, Scrub, Sales Representative, and others).
- CNN Convolutional Neural Network
- Three-Dimensional Recognition facial recognition method based on comparing a 3D facial scan to the database patterns
- Thermal Cameras Thermal face recognition models are based on the unique temperature patterns of a human face
- o FaceNet: the face recognition system FaceNet, developed by Google researchers in 2015, is based on face recognition benchmark datasets; o NEC: the solution developed by the Japanese technology company NEC allows accurate identification of people while recognizing age changes; and o Megvii (Face++): coming from the Chinese technology company Megvii.
- Training, validation, and test data for the algorithms might come from biometric pictures, labelled pictures from real settings, self-made pictures and recordings, assisted pictures and recordings, and others.
- the algorithms and methods should be reliable enough to work under different lighting conditions, facial expressions, surgical masks, hoods, and other common OR equipment.
- thermal imaging can enhance recognition reliability.
- Fiducials: Another way to identify a person is by using any type of fiducial, which can be more easily identified by a specific technology.
- Example fiducials would be RFID tags, visible codes (e.g., QR codes or bar codes), color-coding of clothes (e.g., surgical/scrub caps), visible tags, optical tracking markers (e.g., reflective spheres), and/or electro-magnetic fiducials (which do not require a direct line of sight with a camera/sensor (e.g., detector)). Additionally, the tags can be used to measure a person's position in the operating room.
- Another way to identify a person is to deduce his/her role from the context.
- For example, a person's spatial location, behavior, tasks, and the like can be used.
- There are many methods for classification that can be used for this purpose.
- Convolutional Neural Networks or SVMs could be used as shallow classifiers.
- deep learning classifiers can be used.
- Other model architectures that could be used are: Recurrent Neural Networks (RNNs), Long Short-Term Memory (LSTM) networks, Generative Adversarial Networks (GANs), Deep Belief Networks (DBNs), and others.
- RNNs Recurrent Neural Networks
- LSTM Long Short-Term Memory
- GANs Generative Adversarial Networks
- DBNs Deep Belief Networks
- exemplary ways to identify a person could be based on personal characteristics, e.g., voice, gait, movement pattern, thermovision-based patterns, characteristic elements (glasses, shoes etc.), Bluetooth devices (mobile phone, wearables), and others.
- personal characteristics e.g., voice, gait, movement pattern, thermovision-based patterns, characteristic elements (glasses, shoes etc.), Bluetooth devices (mobile phone, wearables), and others.
- the methods can be used separately, in combination or sequence.
- Results can be combined using a Kalman filter or other methods.
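As one hedged illustration of combining identification methods, the sketch below compares a FaceNet-style face embedding against enrolled staff embeddings by cosine similarity and fuses scores from several methods with a simple weighted sum; the weights, method names, and data structures are assumptions for illustration (the Kalman-filter combination mentioned above is not shown here).

```python
# Minimal sketch (assumption): embedding-based identification plus a simple
# weighted fusion of scores from several methods (face, badge/RFID, context).
import numpy as np


def identify_by_embedding(query: np.ndarray, enrolled: dict[str, np.ndarray]) -> dict[str, float]:
    """Return a cosine-similarity score per enrolled person."""
    q = query / np.linalg.norm(query)
    return {name: float(np.dot(q, e / np.linalg.norm(e))) for name, e in enrolled.items()}


def fuse_scores(method_scores: list[dict[str, float]], weights: list[float]) -> str:
    """Weighted fusion of per-method scores; returns the most likely identity."""
    combined: dict[str, float] = {}
    for scores, w in zip(method_scores, weights):
        for name, s in scores.items():
            combined[name] = combined.get(name, 0.0) + w * s
    return max(combined, key=combined.get)


# Hypothetical usage:
# face_scores = identify_by_embedding(face_embedding, enrolled_embeddings)
# identity = fuse_scores([face_scores, rfid_scores], weights=[0.7, 0.3])
```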
- Stage and task identification can be used separately, in combination or sequence.
- FIG. 4 is a schematic diagram depicting an activity diagram for the system, according to some illustrative embodiments. Note the alternative flow in the “Access” stage depending on bleeding.
- the stages and tasks might come from pre-defined lists or be constructed from scratch or evolve over time.
- the system uses various information to perform this function: visual inputs, audio I/O, outputs from other algorithms (e.g., instrument identification), and/or others.
- Current stage and task can be extracted from different information: persons present and their positions, instruments used, spoken language, noise signature, history of case, comparison to other cases, and/or others.
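A minimal sketch of stage identification from such multi-modal cues is given below, assuming hand-picked vocabularies for people present, instruments used, and spoken keywords, and a generic scikit-learn classifier; the vocabularies and stage names are illustrative only.

```python
# Minimal sketch (assumption): classifying the current surgical stage from a
# multi-modal feature vector (people present, instruments detected, keywords).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

STAGES = ["preparation", "access", "main", "closure", "cleanup"]  # illustrative


def build_features(people: set[str], instruments: set[str], keywords: set[str]) -> np.ndarray:
    """Encode OR observations as a fixed-length binary feature vector."""
    people_vocab = ["surgeon", "scrub_nurse", "anesthesiologist"]
    instrument_vocab = ["scalpel", "retractor", "suture"]
    keyword_vocab = ["incision", "bleeding", "closing"]
    return np.array(
        [p in people for p in people_vocab]
        + [i in instruments for i in instrument_vocab]
        + [k in keywords for k in keyword_vocab],
        dtype=float,
    )


def train_stage_classifier(X: np.ndarray, y: np.ndarray) -> RandomForestClassifier:
    """X: feature vectors from labelled past cases; y: stage indices."""
    return RandomForestClassifier().fit(X, y)


# stage = STAGES[clf.predict([build_features(people, instruments, keywords)])[0]]
```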
- the goal of this function is to enable a wide range of assistive tasks and optimizations in the operating room. These may include predicting which instrument can be used next, what is likely to happen and who should be notified about it and how, what equipment is going to be used and whether it is available in the operating room, what the important features for learning are, how the load on the operating room is going to vary over time, which schedule is the most optimal according to predefined metrics, which instruments are going to be used most and how to optimize their tables and trays (or the instruments themselves) taking that into account, how the surgery will progress and when/who is going to be required to intervene, and others.
- the model needs to be trained first.
- Data for training comes from various capabilities of the OR Module: visual input, audio I/O, sensor input, and/or the like.
- input data for training this model might come from other system functions and modes, e.g., instrument identification, event detection, spatial localization, stage and task detection, past schedules and goals, efficiency of previous predictions, and others.
- Data can be manually introduced, too, for example, for initial guess parameters, to correct irregularities or introduce exceptional situations.
- the system could be used for predicting instruments that are going to be used next, with a level of probability attached, which is later used in various use cases (e.g., the use case Assist with instrumentation and devices).
- the training data could be past experiences about estimated/planned OR use, effective use, planned and effective personnel and equipment use, presence of parallel events (e.g., other rooms use, vacation periods etc.), trends, equipment changes, personnel availability and training, and others.
- This model can be used in an interactive mode to allow the OR Manager to choose the best planning (ref. use cases Manage schedule, Manage team).
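As a hedged example of the instrument prediction mentioned above, the sketch below estimates the next instrument and an attached probability from instrument-to-instrument transition counts observed in past cases; a Markov-style frequency model is only one simple possibility, and the instrument names are placeholders.

```python
# Minimal sketch (assumption): predicting the next instrument with an attached
# probability from transition counts observed in past surgeries.
from collections import Counter, defaultdict


def train_transitions(past_sequences: list[list[str]]) -> dict[str, Counter]:
    """Count instrument -> next-instrument transitions across past surgeries."""
    transitions: dict[str, Counter] = defaultdict(Counter)
    for seq in past_sequences:
        for current, nxt in zip(seq, seq[1:]):
            transitions[current][nxt] += 1
    return transitions


def predict_next(transitions: dict[str, Counter], current: str, top_k: int = 3):
    """Return the top-k likely next instruments with probabilities."""
    counts = transitions.get(current, Counter())
    total = sum(counts.values())
    return [(name, n / total) for name, n in counts.most_common(top_k)] if total else []


# transitions = train_transitions([["scalpel", "retractor", "suture"], ...])
# predict_next(transitions, "scalpel")  # e.g., [("retractor", 0.8), ("suture", 0.2)]
```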
- FIG. 5 depicts an automatic instrument recognition task performed by the system, according to some illustrative embodiments.
- FIG. 5 is a graphical user interface (GUI) that is a VR/AR/MR overlay showing identified instruments to the wearer of a VR/AR/MR headset.
- GUI graphical user interface
- 3D renderings of instrument models can be used. These models are automatically labelled with the correct instrument they were generated from. Viewpoints can be randomly assigned, or a standard set of viewpoints that are most commonly used in the OR can be chosen (e.g., top view for ceiling-mounted Visual Input). From these viewpoints, a large set of 2D images can be generated and used as training, test, and validation sets. The renders can further simulate real-world examples by changing lighting, texture (e.g., material use), adding stains, and others. Alternatively, sets can come from real-world images, labelled completely or partially. For example, an OR team member could show the instrument to one or multiple cameras, and/or move it in the space until the system informs the user that the instrument has been acquired. Then the user could define the label for the instrument and, from this moment on, the instrument could be available for identification.
- 3D renderings of instruments can come from instrument design files, be recreated manually, or acquired by specialized personnel or hospital employees using 3D scanners and other means.
- Instrument identification can be used to recognize all the instruments that are handled, lay on tables or come in instrument trays to perform a complete inventory of devices used.
- Instrument identification can also be used to identify instrument use over time, broken parts, or failures. This could be achieved by learning instrument performance over time (e.g., in conjunction with instrument-relevant events), changes in its visual appearance (e.g., de-colorization, missing or deformed parts), usage tracking, or others.
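A minimal sketch of instrument detection at inference time is shown below, assuming an off-the-shelf object detector (here torchvision's Faster R-CNN) fine-tuned on the labelled or synthetically rendered instrument images described above; the class list and score threshold are illustrative assumptions.

```python
# Minimal sketch (assumption): running a fine-tuned detector over an OR frame
# and keeping detections above a confidence threshold.
import torch


def detect_instruments(image_tensor: torch.Tensor, model, class_names: list[str],
                       threshold: float = 0.6):
    """Return (class name, score, box) tuples for detections above the threshold."""
    model.eval()
    with torch.no_grad():
        output = model([image_tensor])[0]  # torchvision detection API: boxes/labels/scores
    results = []
    for label, score, box in zip(output["labels"], output["scores"], output["boxes"]):
        if score >= threshold:
            results.append((class_names[int(label)], float(score), box.tolist()))
    return results


# Hypothetical usage (model fine-tuned on instrument images beforehand):
# import torchvision
# model = torchvision.models.detection.fasterrcnn_resnet50_fpn(num_classes=len(class_names))
# detections = detect_instruments(frame_tensor, model, class_names)
```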
- Speech recognition and synthesis are used to obtain additional sources of information in the OR and simplify interaction with the system. They can be used in conjunction with instrument identification (e.g., to help identify a used instrument or its failure based on exchanges in the OR), to detect user intent, to input a user request (e.g., for help with an instrument, to notify a team member, to request equipment), to detect user intent and equipment during the surgery preparation stage, as an input for constructing a user training case (e.g., the surgeon asks for instrument X, and the trainee should be able to quickly identify which instrument it is), for analytics of what, when, and how things are said in the OR, to enable precise surgery follow-up (e.g., better stage identification based on what is said and how), and others.
- instrument identification e.g. to help to identify used instrument or its failure based on exchanges in the OR
- input user request e.g. for help with an instrument, to notify team member, to request equipment
- user intent and equipment during surgery preparation stage used as an input for constructing user training case (e.g. surgeon asks for instrument X, trainee should
- Input to speech recognition comes from OR Module Audio I/O, Terminals, or Viewing Stations equipment. This can be specialized equipment or standard consumer-class.
- a speech signal is preprocessed, for example, it is normalized, voice activity is detected, and noise reduced.
- features are extracted, for example, prosodic features using pitch and duration, spectral features using LPC and Formants, and voice quality features using jitter and Teager Energy Operator features. Then, features are selected and reduced, for example using principal component analysis, and classified, for example using k-Nearest Neighbor.
- Speaker recognition is an important part of the system. The goal of the recognition is to distinguish an individual from a group of known individuals based on his/her voice/text sample. There are several parameters that enable such recognition, for example pitch, tone, accent, and other pertinent features.
- Various deep learning-based models such as CNNs, RNNs, and their combinations can be used in several subtasks of speech and speaker identification, including verification, identification, diarization, and robust recognition. For example, a new user might be asked during registration to the system to record a few voice samples, read an example text, or react to visible cues. Such samples might be used for training, validation, and testing of the model that is then used in the OR. Such models can be further improved with OR-acquired data, for example, by cross-checking against identification coming from other means, e.g., visual.
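The following sketch illustrates one conventional speaker-identification pipeline consistent with the description above: MFCC features extracted from short registration samples and a simple classifier; the file names, sample rate, and choice of an SVM are assumptions for illustration.

```python
# Minimal sketch (assumption): speaker identification from short enrollment
# samples using MFCC features and a simple classifier.
import librosa
import numpy as np
from sklearn.svm import SVC


def mfcc_features(audio_path: str, n_mfcc: int = 13) -> np.ndarray:
    """Load an audio clip and summarize it as mean MFCC coefficients."""
    signal, sr = librosa.load(audio_path, sr=16000)  # resample to 16 kHz
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)


def train_speaker_model(samples: list[tuple[str, str]]) -> SVC:
    """samples: (audio_path, speaker_name) pairs recorded at registration."""
    X = np.stack([mfcc_features(path) for path, _ in samples])
    y = [speaker for _, speaker in samples]
    return SVC(probability=True).fit(X, y)


# Hypothetical usage:
# model = train_speaker_model([("dr_fault_sample1.wav", "Dr. Dee Fault"), ...])
# model.predict([mfcc_features("or_clip.wav")])
```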
- the emotion recognition function is important for assessment of how well the OR team works together, and for identification of potential problems with team composition, training, equipment, or emotional compatibility.
- the emotion recognition system identifies the existence of a user’s varied emotions by extracting and classifying the prominent features from available signals.
- two main signals are visual (e.g., facial) and audio (e.g. speech).
- the System may define a set of pre-defined emotions that will be matched with data. For example, these pre-defined emotions could be: Happy, Surprise, Neutral, Sad and Angry. Furthermore, the system may detect various levels of user focus on a task.
- the following algorithm could be used.
- user facial images need to be pre-processed first, for example to remove noise, scale, mask and convert to grayscale.
- a feature extraction algorithm could be used. This could be LBP, Gabor, HOG or others.
- a final feature vector is created using concatenation. To remove unwanted features, dimensional reduction can be used.
- one of the classification algorithms such as SVM, RF, and kNN can be used to classify the AUs (Action Units).
- a classifier programmed to perceive any emotions in new speech signals is needed.
- a categorical emotional model can be built to classify emotions from a predefined set above.
- the database of input data needs to be built.
- this could be a natural database coming from recordings in the OR (e.g., linked with events in the OR, e.g., sterility loss), or it can be elicited, when specific user behavior is provoked or users are asked to act.
- the system pre-processes speech signals using framing, normalization, and voice activity detection. Specific voice features can be extracted using Prosodic, Spectral, Voice Quality, or Teager-Energy Operator models.
- the final classification might be done based on traditional or deep learning classifiers.
- classifiers include Gaussian Mixture Model, Hidden Markov Model, Artificial Neural Network, and Support Vector Machines.
- Some other traditional classification techniques involve k-Nearest Neighbor (kNN), Decision Trees, Naive Bayes Classifiers, Deep Belief Networks (DBN), Deep Boltzmann Machine (DBM), Restricted Boltzmann Machine (RBM), Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM), and Convolutional Neural Networks (CNN); a minimal classifier sketch follows the list below.
- kNN k-Nearest Neighbor
- DBN Deep Belief Networks
- DBM Deep Boltzmann Machine
- RBM Restricted Boltzmann Machine
- RNN Recurrent Neural Networks
- LSTM Long Short-Term Memory
- CNN Convolutional Neural Networks
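Here is the minimal classifier sketch referenced above, assuming fixed-size grayscale face crops, HOG features, and an SVM over the pre-defined emotion set; the crop size and feature parameters are illustrative choices.

```python
# Minimal sketch (assumption): classifying a pre-defined emotion set from
# grayscale face crops using HOG features and an SVM.
import numpy as np
from skimage.feature import hog
from sklearn.svm import SVC

EMOTIONS = ["Happy", "Surprise", "Neutral", "Sad", "Angry"]


def hog_features(face_gray: np.ndarray) -> np.ndarray:
    """face_gray: pre-processed grayscale crop of a fixed size (e.g., 64x64),
    so that all feature vectors have equal length."""
    return hog(face_gray, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))


def train_emotion_classifier(faces: list[np.ndarray], labels: list[str]) -> SVC:
    """faces: fixed-size grayscale crops; labels: strings from EMOTIONS."""
    X = np.stack([hog_features(f) for f in faces])
    return SVC().fit(X, labels)


# clf = train_emotion_classifier(training_faces, training_labels)
# emotion = clf.predict([hog_features(new_face)])[0]
```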
- a first type of event is linked to contamination risk for the patient or personnel. Examples are incorrect room preparation (e.g., missing sterile drapes), incorrect instrumentation handling (e.g., assembly of instruments, like powered devices, where a non-sterile battery needs to be inserted into a sterile cover), incorrect aseptic technique when putting on sterile clothes, filling up liquids (often done in tandem by the Scrub and Circulating Nurse), accidental touching of non-sterile surfaces by sterile equipment (see, for example, FIG. 6), passing too close to sterile equipment (generally 1 m or more is considered safe), instruments falling on the ground, and others.
- incorrect room preparations e.g. missing sterile drapes
- incorrect instrumentation handling e.g. assembly of instruments, like powered devices, where non-sterile battery needs to be inserted into sterile cover
- incorrect aseptic technique when putting on sterile clothes
- filling up liquids, often done in tandem by the Scrub and Circulating Nurse
- a second group of events is more linked to general patient safety and well accepted practice, for example, instrument and sponge count, avoiding excessive patient pressure, and the like.
- Other types of events that are important for safety are correct instrument assembly (especially for precise instruments used, e.g., in stereotactic neurosurgery), correct handling (e.g., avoiding hits or excessive pressure on delicate instruments like measurement probes), avoiding accidental hitting of optical/electro-magnetic navigation arrays, and others.
- the System can detect events using Visual and/or Audio I/O inputs. Other system internal functions, for example, spatial localization, can be used for this task, too.
- a selected event type e.g., touching of non-sterile surface, hitting navigation array etc.
- a sequence of sensor streams (audio, video, and others)
- Additional sensors might be used for a specific purpose, e.g., optical tracking for navigation arrays, thermography for liquids, high quality video for surgical supplies count, and the like.
- the sensor stream series might be precisely labelled (e.g., each frame) or coarsely labelled (e.g. happening within 1 min). With this data, models and algorithms are trained, tested, and/or verified.
- Example algorithms used for event detection could be convolutional neural networks (CNNs) (1D or higher-dimensional), histogram of oriented gradients (HOG), histogram of flow (HOF), motion boundary histograms (MBH), dense trajectories, a long short-term memory (LSTM) model, and others.
- CNNs convolutional neural networks (1D or higher-dimensional)
- HOG histogram of oriented gradients
- HOF histogram of flow
- MBH motion boundary histograms
- LSTM long short-term memory
- the model trained can further improve its own functioning automatically or with user feedback. For example, if an event happens, the OR team could inform the system (e.g., by saying "sterility loss", ref. Speech recognition function), and the system can use it to further improve its algorithms.
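As a hedged illustration of one of the listed approaches, the sketch below defines a small LSTM that classifies short windows of per-frame feature vectors as event vs. no event (e.g., sterility loss); the feature dimension, window length, and class count are assumptions.

```python
# Minimal sketch (assumption): an LSTM over windows of per-frame features
# producing logits over event classes.
import torch
import torch.nn as nn


class EventDetector(nn.Module):
    def __init__(self, feature_dim: int = 128, hidden_dim: int = 64, n_events: int = 2):
        super().__init__()
        self.lstm = nn.LSTM(feature_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, n_events)

    def forward(self, window: torch.Tensor) -> torch.Tensor:
        # window: (batch, frames, feature_dim) of per-frame visual/audio features
        _, (h, _) = self.lstm(window)
        return self.head(h[-1])  # logits over event classes


# detector = EventDetector()
# logits = detector(torch.randn(1, 30, 128))  # one 30-frame window
```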
- Operating rooms are complicated environments that may make training machine learning modules to (e.g., accurately) perform certain functions, such as machine vision tasks, and/or produce (e.g., accurate) audio and/or visual output difficult.
- a user may speak in a manner that can be captured by a microphone in an operating room as one or more events are occurring to provide audio-derived annotations to visual data (e.g., video recordings) of the event(s) for use in training a machine learning module.
- a machine-learning module may be trained (e.g., fine-tuned) using a combination of visual and/or sensor data and corresponding audio data (e.g., where the audio data provide annotations for the visual data), for example in addition to a (e.g., more general) pre-training.
- Such training may be used for event detection and/or stage and/or task identification.
- a machine learning module may be trained using data specific to a particular operating room. Operating rooms may have wide variation that makes general training less preferable or less accurate.
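A minimal sketch of turning audio-derived annotations into training labels is given below: timestamped utterances recognized from OR speech are mapped onto coarse labels for the surrounding video frames; the window length, frame rate, and label names are illustrative assumptions.

```python
# Minimal sketch (assumption): converting timestamped speech annotations
# (e.g., "sterility loss" spoken at t seconds) into coarse per-frame labels.
def label_frames_from_speech(n_frames: int, fps: float,
                             utterances: list[tuple[float, str]],
                             window_s: float = 30.0) -> list[str]:
    """utterances: (time_in_seconds, label) pairs derived from speech recognition."""
    labels = ["no_event"] * n_frames
    for t, label in utterances:
        start = max(0, int((t - window_s / 2) * fps))
        end = min(n_frames, int((t + window_s / 2) * fps))
        for i in range(start, end):
            labels[i] = label  # coarse labelling of the surrounding window
    return labels


# labels = label_frames_from_speech(n_frames=9000, fps=30.0,
#                                   utterances=[(125.0, "sterility_loss")])
```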
- a goal of spatial localization is to identify three-dimensional (3D) positions of people, their clothes, instruments, and equipment in the operating room. For example, identifying where a person and his/her clothes are, together with information about surrounding equipment, can help with identifying whether a sterility loss event happened due to touching of a non-sterile surface. Identifying the position of the OR instruments can help with their assembly by showing how to assemble the parts that are currently used and where to place them for quicker reuse. Frequency, position, and speed of manipulation can be used for analytics, for identifying how well instruments are used, and for assessing risk related to their use. Positions of people and instruments can be used to better identify the surgery stage (e.g., an assistant with retractors and scalpels ready during the patient opening stage).
- surgery stage e.g., an assistant with retractors and scalpels ready during patient opening stage.
- Information about people positions and/or instruments use can be used in user training and assessment. For example, the best location for people and equipment can be identified in most effective surgeries (e.g., the shortest ones with correct outcomes) and subsequently used to train nurses, surgeons and OR team. Similarly, the best positions of equipment and instrumentation can be used in the surgery preparation stage to guide OR staff.
- the 3D scene/model of the OR can be automatically created in real time using LiDAR or machine vision technology (see, for example, FIG. 7).
- spatial transformation e.g., translation and rotation, e.g., represented by a homogenous transform matrix
- scene elements e.g., people, clothes, equipment, instrumentation.
- This can be achieved by on-the-fly registration of fixed or dynamic scene models using affine, scalable, warping, or other registration method.
- an ICP (Iterative Closest Point) algorithm, the Horn transformation, or another algorithm can be used for this purpose.
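For illustration, the sketch below implements the closed-form rigid alignment (Horn/Kabsch solution) of corresponding 3D points that underlies a single ICP iteration; correspondence search and iteration are omitted.

```python
# Minimal sketch (assumption): closed-form rigid registration of corresponding
# 3D point sets (Horn/Kabsch), the building block of one ICP iteration.
import numpy as np


def rigid_transform(source: np.ndarray, target: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Return rotation R and translation t so that R @ source_i + t ~ target_i.
    source, target: (N, 3) arrays of corresponding points."""
    src_c, tgt_c = source.mean(axis=0), target.mean(axis=0)
    H = (source - src_c).T @ (target - tgt_c)  # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:  # correct an improper rotation (reflection)
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    t = tgt_c - R @ src_c
    return R, t


# R, t = rigid_transform(scanned_points, model_points)
# aligned = scanned_points @ R.T + t
```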
- the algorithms may adapt existing or develop new equipment models based on sensor information.
- an instrument model can be created based on 3D scene information.
- for adjustable instruments, their models can be automatically updated to detect the status the instruments are in.
- the localization technique can be based on skeletal approximation.
- a virtual skeleton is associated with or assigned to each person.
- Based on visual information, RFID tags, and IMUs (Inertial Measurement Units), the position and configuration of a person are calculated.
- a person’s position can be extracted directly from one or multiple video streams, especially taken from different angles, using standard or machine-learning based algorithms.
- Other options are proximity-based technologies, where position is assessed based on localization with reference to a predefined base position, for example, using automatic ID systems, RFID tags, or Bluetooth.
- the position can be assessed based on visible markers, for example QR codes, processed by one or more cameras and video streams.
- Other methods for estimating people/human positions and postures involve various computer vision and deep neural network algorithms, such as CNNs or LSTMs.
- 3D meshes of human bodies can be extracted from single images and then tracked over time using video streams.
- Example algorithms involve Human Mesh Recovery, PHALP (Predicting Human Appearance, Location and Pose for tracking), ViT (Vision Transformer) and ViTPose and others.
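A hedged sketch of skeletal keypoint extraction with an off-the-shelf detector is shown below; the pretrained torchvision model, score threshold, and absence of OR-specific fine-tuning or temporal tracking are simplifying assumptions.

```python
# Minimal sketch (assumption): per-person skeletal keypoints from one OR frame
# using a generic pretrained keypoint detector (recent torchvision API).
import torch
from torchvision.models.detection import keypointrcnn_resnet50_fpn


def estimate_skeletons(frame_tensor: torch.Tensor, score_threshold: float = 0.8):
    """frame_tensor: (3, H, W) image in [0, 1]. Returns keypoints per detected person."""
    model = keypointrcnn_resnet50_fpn(weights="DEFAULT").eval()  # in practice, cache the model
    with torch.no_grad():
        output = model([frame_tensor])[0]
    keep = output["scores"] >= score_threshold
    return output["keypoints"][keep]  # (num_people, 17, 3): x, y, visibility


# skeletons = estimate_skeletons(frame_tensor)
```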
- Context segmentation is used for determining the nature of an object, person, surface, and the like. For example, sterile surfaces and clothes need to be distinguished from non-sterile, mobile equipment (e.g., instrument tables and trays) from fixed equipment (e.g. OR table attached to the floor), surgical instruments from other devices, certain disposables (e.g. countable items: sponges, rubber bands etc.) from reusables, and the like (see FIG. 8).
- FIG. 8 depicts a graphical user interface showing a VR/AR/MR overlay of an automatic context segmentation performed by the system. This overlay identifies selected devices (instrument table, drapes), their classes (sterile), and people (here, surgeon).
- Information about context segmentation can be used to assist with devices and surgery preparation (e.g., by indicating correct places to prepare instrumentation, tables, and trays), to detect events (e.g., determining which element is sterile and which is not at this concrete surgery stage), for the analytics (use of space, human resources and devices across surgery, OR, institution, or the like, determining improvement points), to follow-up on surgery advancement (presence of specific devices or people in the OR), training (e.g., by indicating how to organize space), and others.
- Context segmentation can be implemented in various ways.
- the system can be taught to recognize the context. This can be done using various machine learning algorithms, trained using labelled training, verification, and testing sets (e.g., video streams with segmented elements). This can be done using, for example, machine learning classification algorithms.
- Another option is to use distinguishing features to mark specific classes of objects. For example, sterile surfaces are usually covered with a specific color drape in the OR.
- Another option is by tracking the object and recognizing its class from type and history of its use (e.g., items taken out from sterilization trays are sterile, e.g., see the Section on Spatial localization).
- a combination of various methods can be used to increase coverage or reliability of the system. In such a case, a Kalman filter can be used to consider different measurement and analysis streams.
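- A minimal sketch of such fusion, assuming two noisy measurement streams (e.g., vision-based and RFID-based) of a single position coordinate, is given below; the noise parameters are illustrative and not taken from the disclosure.

```python
# Sketch: a scalar Kalman filter fusing two measurement streams (e.g., vision and RFID)
# of one position coordinate. Noise values are illustrative assumptions.
import numpy as np

def kalman_fuse(z_vision, z_rfid, r_vision=0.05, r_rfid=0.5, q=0.01):
    x, p = z_vision[0], 1.0           # initial state estimate and variance
    estimates = []
    for zv, zr in zip(z_vision, z_rfid):
        p += q                        # predict (static model): variance grows by process noise
        for z, r in ((zv, r_vision), (zr, r_rfid)):
            k = p / (p + r)           # Kalman gain for this measurement
            x += k * (z - x)          # update with measurement
            p *= (1 - k)
        estimates.append(x)
    return np.array(estimates)

# Example: fuse two noisy observations of a true position of 2.0 m
rng = np.random.default_rng(0)
true = np.full(100, 2.0)
fused = kalman_fuse(true + rng.normal(0, 0.05, 100), true + rng.normal(0, 0.5, 100))
print(fused[-1])
```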
- a sterile surface can become considered as non-sterile at the end of surgery, during clean-up, and after patient close (e.g., see the Section on Stage and task identification).
- by "video streaming," a stream of video, audio, or data (e.g., haptic data), or any combination of these, is understood.
- This streaming capacity can be achieved using any of the appropriate OR Module devices, e.g., Visual Input, Audio I/O, and/or Additional Sensors.
- the stream can be processed using algorithms.
- audio can be processed to improve voice clarity (e.g., by using compressors, limiters, equalizers, and others).
- video can be processed to anonymize persons and patients and to remove sensitive information. This can be achieved using automatic algorithms, based on machine learning and standard approaches, that identify such information in the stream and proceed with its removal or coverage.
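- As one possible, non-authoritative sketch of such anonymization, the following Python/OpenCV example blurs detected faces in a frame using a Haar cascade; a production system would likely rely on stronger, learned detectors, and the input file name is an assumption.

```python
# Sketch: blur detected faces in a frame before streaming (OpenCV Haar cascade).
import cv2

face_det = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def anonymize(frame):
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in face_det.detectMultiScale(gray, 1.1, 5):
        roi = frame[y:y + h, x:x + w]
        frame[y:y + h, x:x + w] = cv2.GaussianBlur(roi, (51, 51), 0)
    return frame

frame = cv2.imread("or_frame.jpg")       # assumed frame from the video stream
if frame is not None:
    cv2.imwrite("or_frame_anon.jpg", anonymize(frame))
```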
- Video streaming can be used for communication with other interested parties.
- the surgeon can communicate with other specialists to share knowledge and assist with difficult case resolution.
- the video snippets can be used to document a particular situation in the OR, e.g., sterility loss due to touching non-sterile device, and reused for team information or training. Repetitive situations can be case-matched and used for training, analytics, or documentation.
- Video recordings can be used for documenting best practices later shown during the training stage.
Situation and person learning
- Situation and person learning means gathering knowledge about how specific tasks in the OR are achieved with a focus on process, people, equipment, and timeline. For example, this could involve defining what is the principal process for the specific surgeon (e.g., Dr. Dee Fault) to conduct specific surgery (e.g. femur osteotomy). It would contain a sequence of tasks (e.g., incision, measurement, initial cut, measurement, second cut, and the like), with alternative flows (e.g., bone fissure detection), time lengths, instruments involved, people involved, information consulted, outcome impact, and the like.
- This information could be used in the surgery preparation stage by making sure that all the required instrumentation and devices are ready. Situation and person learning is important for preparing and getting a surgeon-/surgery-/OR-/equipment-specific customized training. It can be used to proactively request equipment that is going to be used (ref. use case Request equipment). Another potential use involves simulating OR load using actual case models and help with OR schedule management (ref. OR Manager use cases).
- Learning could be implemented building on outputs from other System algorithms, for example, the following: Person identification, Stage and task identification, Instrument identification, Event detection, Spatial localization, Context segmentation and others. Data from these algorithms can be related together to create training, validation and testing datasets. These datasets can be used for training machine learning algorithms. These algorithms could involve CNNs, RNNs, LSTM networks, GAN networks, Principal Component Analysis, Support Vector Machines, and others.
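- A hypothetical sketch of how outputs of these algorithms could be related into a dataset and used to train a classifier (here, predicting the surgery stage) is shown below; the feature columns, labels, and file name are invented for illustration only.

```python
# Sketch: relating outputs of other System algorithms into a feature table and training
# a classifier that predicts the surgery stage. Column names and labels are hypothetical.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Hypothetical per-time-window features produced by person identification, spatial
# localization, instrument identification, event detection, etc.
df = pd.read_csv("or_features.csv")          # assumed export of fused algorithm outputs
X = df[["n_people", "n_sterile_objects", "scalpel_present", "retractor_present",
        "minutes_elapsed"]]
y = df["surgery_stage"]                      # e.g., "opening", "cutting", "closing"

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("stage-prediction accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```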
- the goal of this function is to generate a surgeon-/surgery-/OR-/equipment-specific customized training for concerned stakeholder(s) (e.g., Nurse, Scrub, Surgical Physician Assistant, Surgeon, or the like).
- This function may use input information from other modules, for example, those described in Situation and person learning Section.
- FIG. 9 is a screenshot of a graphical user interface showing an example of the instrument learning function, according to some illustrative embodiments, where the user is presented with various instruments and needs to correctly assign their names.
- the generated training could be achieved by showing instruments to the users and asking them to choose the right name, e.g., from a list, or to say it aloud.
- the user could be presented with real-life demonstrations of instrument assembly, excerpts from surgery (see the Section Video streaming), or pages or materials adapted from the instrumentation manuals.
- FIG. 10 is a screenshot of a graphical user interface showing an example instrument table layouts learning function, according to some illustrative embodiments. The trainee is presented with various instruments and the probability of their use at this surgery stage. Based on this, she/he designs the most adequate table layout. An example or optimal layout is used for reference. Layouts could come from other system models, for example, Learning, to optimize the speed and frequency of instrument preparation.
- FIG. 11 is a screenshot of a graphical user interface showing an example for training of instrumentation-task assignment, according to some illustrative embodiments.
- the user can be presented with still images, video streams, overlays, and others, from actual surgery.
- the learning of stages and tasks that are part of the surgery could be divided into several levels: stages (e.g., opening, cutting, implant placement), subtasks (incision, bleeding control, tissue separation), and tasks (cut through skin, prepare electrosurgical device, and the like).
- Each stage of the training could be divided into a generic part, applicable to many surgeries, and specific parts that are more surgeon-/OR-/equipment-specific. For example, the trainee might learn that in most cases surgeon Dr. Dee Fault skips through certain parts while other surgeons do it differently.
- the goal of this stage is to assess how well a person fits with different tasks he/she can be assigned to, define improvement areas, and help with team and schedule management. This information could be used to obtain specific, person-related analytics and insights, could be used for learning and pro-actively predicting course of action in the OR, simulating load and managing schedule to have the most efficient resource utilization, to compose the best performing teams, and to manage personal trainings.
- Personnel assessment could be achieved by analyzing personal behavior from the operating room and outcome from System algorithms. For example, the information on speed and correctness of preparation, instrument waiting time, events frequency, presence and type of tasks performed could be used (see relevant Sections describing these functions and use cases).
- FIG. 12 is a screenshot of a graphical user interface showing an example employee dashboard showing OR/Surgery performance, training compliance, and task assessment, according to some illustrative embodiments. From the difference between required and actual task assessment, a training program can be suggested and the schedule adapted (see the corresponding buttons). The dashboard can be compared to other data, e.g., institution average and best scores, other institutions, or the like.
- FIG. 13 is a schematic diagram illustrating Scrub Nurse use cases facilitated by the system, according to some illustrative embodiments.
- the illustrative “Assist with device and instrumentation” use case performed by the system helps the Scrub Nurse to prepare and organize for the most efficient instrumentation handling (see FIG. 14).
- FIG. 14 is a screenshot of a graphical user interface showing example assistance shown during surgery, according to some illustrative embodiments. The following items are noted in the screenshot of FIG. 14: (1) the system identified a possible improvement in the instrument table layout and suggests it to the user together with an example reference layout; and (2) the system identified a missing instrument, which has a high probability of usage in coming instances, stages, and tasks.
- a goal of the Scrub Nurse Use Cases is to reduce to a minimum the Surgeon’s waiting time for the instrument, implant, device, or any other equipment (called, collectively, instruments). This is achieved thanks to System awareness of the current surgery stage and task.
- This information, using the Predictive analytics, is used to identify which stage will follow and which instruments are going to be used. Instrument and equipment identification is used to assess whether relevant instruments are already prepared. Each instrument can have a probability-of-use score attached to better judge priorities.
- the Projector and Audio I/O can be used. For example, the Projector might highlight the instrument with a laser pointer, with rear projection, or with a spotlight; the system might show the instrument on the OR Screen, highlight it through the AR Headset, announce it through a speaker, or the like.
- instrument table set-up to optimize instrument handling time might be communicated to the scrub. Such set-up could be calculated based on probability of instrument use, instrument size, assembly, and the like. For communication, any projection means or OR Screen might be used.
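- As an illustrative, non-authoritative sketch, a probability-of-use score per instrument and stage could be estimated from historical surgery logs as below; the log structure and example entries are hypothetical.

```python
# Sketch: estimating a probability-of-use score per instrument for a given stage from
# historical surgery logs. The log structure is a hypothetical example.
from collections import Counter, defaultdict

# (surgery_id, stage, instrument) tuples extracted from past cases
history = [
    ("s1", "opening", "scalpel"), ("s1", "opening", "retractor"),
    ("s2", "opening", "scalpel"), ("s2", "opening", "forceps"),
    ("s3", "opening", "scalpel"), ("s3", "opening", "retractor"),
]

stage_cases = defaultdict(set)
usage = Counter()
for surgery, stage, instrument in history:
    stage_cases[stage].add(surgery)
    usage[(stage, instrument)] += 1

def use_probability(stage, instrument):
    """Fraction of past cases of `stage` in which `instrument` was used."""
    n_cases = len(stage_cases[stage]) or 1
    return usage[(stage, instrument)] / n_cases

print(use_probability("opening", "retractor"))   # 2/3 of past openings used a retractor
```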
- a goal of the "Obtain help” use case is to provide immediate help to the scrub concerning instrument assembly and use. Many instruments, from various manufacturers and systems, are being used in the operating room and it is increasingly complex to retain how they are meant to be operated.
- the System, based on the current surgery stage and task, as well as instruments shown by the nurse (e.g., held up towards the Visual Input), shows instructions or help for assembly and instrument use (see FIG. 15 and FIG. 16). This information can be shown with the Projector, OR Screen, AR Headset, or others.
- the Scrub or other user can further interact with the system, asking questions and giving speech feedback or instructions. For example, they might ask for clarification of a particular step, request an assembly animation, or view it from a different angle.
- the help contents might be provided by instrumentation manufacturer, company providing the System or by OR Staff themselves.
- FIG. 15 depicts a graphical user interface showing a VR/AR/MR overlay depicting an OR Staff member showing an instrument to the system to request assistance in its identification and/or use, with the automatically determined identification shown in the overlay, along with a probability that said determination is accurate, according to some illustrative embodiments.
- FIG. 16 is a screenshot of a graphical user interface showing assistance that is presented based on a detected instrument, surgery stage, staff exchange, or other trigger, according to some illustrative embodiments.
- Example assistance shows what needs to be done for instrument exchange and assembly. Still pictures, descriptions, short videos, animations, or the like, can be used.
- the Detect events use case helps with detection of surgery-important events. These could be potential sterility loss due to incorrect aseptic technique (e.g., touching of non- sterile surface), incorrect instrument assembly, unintentional hits of the navigation markers (fiducials attached to patient anatomy to enable tracking), and others. This is achieved thanks to spatial localization of persons and objects linked with context segmentation (sterile/non- sterile) and stage and task identification (e.g., at the end of the surgery all the instruments can be safely scrapped by a non-sterile person).
- some events need to be communicated to the OR Staff (e.g., sterility loss). This could be done via OR Screen, projector, sound alarm, or others. After obtaining staff attention, the event can be shown (e.g., the moment when sterile clothing accidentally touched non-sterile wall) for the OR Staff to decide on handling the issue. It should be possible to acknowledge certain events. Events should be logged for further analysis.
- the Communicate use case enables OR Staff to simply communicate with each other or with external people. This can be simple voice communication, or more advanced communication including video streams or better situation awareness (e.g., depending on the surgery stage, the person triggering the communication, etc.). This information can be used to better define to whom the communication is targeted and what is being asked.
- the Get trained use case enables transfer of user and situation-specific knowledge to the trainee.
- Training is procedure- and surgeon- specific as well as adapted to the trainee (nurse, scrub, particular person). It uses past experiences, events, instruments and equipment, timing information, video and audio streams, target audience, and the like to generate training goals and training content.
- Training can be viewed on Viewing Station in a series of learning and training materials.
- Final result assessment can be provided to assess candidate and training effectiveness. Additionally, this effectiveness can be assessed during follow-up cases in the OR.
- the training can be done on Terminal, Viewing Station, or a specific training device.
- a specific training device could be a flat-lying touchscreen with a PC, AR (Augmented Reality) or VR (Virtual Reality) devices (or MR (mixed reality) devices), haptic devices, holographic devices and projectors, laser projectors, joysticks, phantom devices, wearables, IMUs (Inertial Measurement Unit) devices, training instrument sets, and others.
- the Follow-up use case enables users to be up to date with operating room events. For example, it allows OR Staff to concentrate on other activities or take a rest until a relevant stage in the OR is reached and their presence is required.
- the Request equipment use cases are requests for a Nurse to bring equipment to the OR, emergency assistance requests, and automatic reactions to events (e.g., instrument sterility is compromised and a new instrument needs to be brought into the room). Notifications can be shown on Terminals or Viewing Stations. This function is enabled by automatic stage and task identification, and its outcomes can be automatically assessed (e.g., has the required action happened on time) and used for future improvements.
- Request equipment is used, based on a user trigger or automatically, to ask for specific equipment to be brought to the OR. The goal is to always have the required instrumentation and equipment in the OR. Requests can be generated automatically based on surgery stage and current inventory in the OR (e.g., automatically assessed), previous learning, events, and the like.
- Requests can be user-triggered; for example, a simple voice command such as “Ask the nurse to bring new scalpels” could be directly text-coded as “Surgeon X requests new scalpels in block number Y” and sent to the relevant person’s terminal (here, the Nurse Terminal).
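- A minimal sketch of such text coding of a voice-triggered request, assuming the transcript is already available from speech recognition, is shown below; the phrase pattern and the send_to_terminal() routing mentioned in the comment are hypothetical.

```python
# Sketch: turning a transcribed voice command into a coded equipment request routed to a
# terminal. The phrase pattern and routing are hypothetical illustrations.
import re

def code_request(transcript, surgeon, room):
    m = re.match(r"ask the nurse to bring (?P<item>.+)", transcript.strip(), re.IGNORECASE)
    if not m:
        return None
    return f"Surgeon {surgeon} requests {m.group('item')} in block number {room}"

msg = code_request("Ask the nurse to bring new scalpels", surgeon="X", room="Y")
print(msg)   # would then be routed, e.g., send_to_terminal("nurse", msg)  (hypothetical)
```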
- the Get analytics use case allows the surgeon to assess his and his team’s performance. For example, this would be reviewed in an office setting, on a Viewing Station. Based on intra-operative data, the surgeon could review several graphs and indicators, for example: effective OR, effective OR time vs. planned time, preparation time, instrumentation waiting time, ambience/mood in the OR, event statistics (detected events, number of them, and the like), surgery stage statistics, personnel statistics (who, where, when, etc.), trends, costs, and others. The surgeon could have the possibility to compare his data to a wider pool of data coming from other surgeons in the same institution, other hospitals, other geographies, and worldwide.
- the use case enables real-time follow-up of the surgeries (see FIG. 18).
- the follow-up is done automatically or semi-automatically with some user participation.
- Follow-up information can be communicated using scales, advancement graphs, textual messages, logs, event reports, comparisons to statistical results, goals, and the like, in order to keep all the stakeholders up to date. The expected duration of each stage, based on previous experiences and learning, can be indicated.
- Follow-up can be done on Terminal, Viewing Station or other device. This use case can be linked to Get notified, for example to include alarms, reminders and others.
- FIG. 18 is a screenshot of a graphical user interface presented on a mobile device, the screenshot showing a follow-up visible on a Surgeon Terminal in a Surgeon use case, according to some illustrative embodiments.
- the user has a clear view of current surgery stage, can consult video stream, and call OR staff. He has a clear overview of when his involvement begins and can plan accordingly. Alarms can be set for predefined stages, here, one minute before expected cutting.
- the use case Learn is happening in the background and concerns the System rather than the surgeon.
- the System constantly gathers data about persons, equipment, instruments, events, exchanges, visual and audio, mood, timing, spatial, and others. This data is later used for analytics, obtaining intelligence, predictive simulations, training, and others.
- Circulating Nurse, or Nurse, use cases enabled by the System are shown in FIG. 19.
- Circulating Nurse use cases Assist with surgery preparation, Detect events, Get notified, and Request equipment are similar to use cases described in previous Sections. These use cases should be adapted to the specific needs and role of the Nurse. For example, she can bring equipment based on the request from the OR (see FIG. 20). She can receive messages, voice or video calls on her terminal. The messages can be automatically acknowledged and checked when requested equipment is in the OR. The Nurse participates actively in surgery preparation. In this use case, she can directly receive information about what needs to be prepared and where particular attention needs to be paid, on her terminal or OR Screen. As mentioned in previous Sections, surgery preparation is based on dynamic lists that adapt over time to surgery type, surgeon, personnel and others.
- FIG. 20 is a screenshot of a graphical user interface presented on a mobile device, the screenshot showing a Nurse Request Equipment use case visible on a Nurse Terminal in a Nurse use case, according to some illustrative embodiments. It shows an overview of what is requested in each OR, together with an estimation of when it is going to be used. Color-coding, alarms, and ordering can be used to communicate priority. Expected equipment, together with an attached probability, can be foreseen by the System based on previous experiences, surgery execution, and detected events.
- The OR Manager use cases enabled by the System are shown in FIG. 21. The OR Manager use cases Follow-up and Get notified are similar to the use cases described in previous Sections. These use cases should be adapted to specific OR Manager needs, for example, by showing data of interest for him (e.g., whole team performance, OR global workload, and the like), notifications about specific events, and others.
- The Simulate load use case allows the OR Manager to simulate OR load for a predefined future period of time (e.g., week, month, quarter, and the like). For load, many factors are considered, like effective OR use, probability of delays, readiness, personnel adequacy, expected timings, spare capacity, and others. This use case allows the user to effectively simulate the impact of various parameters like personnel changes, equipment, new treatments, new approaches, training needs, and the like. Data for simulation, especially for areas new to a particular institution, can come from other institutions, physicians, settings, previous history, and others. This use case is an effective support for decision making from both clinical and health economics points of view.
- The Manage schedule use case makes use of Simulate load and other information to help with OR schedule management. It is a known problem that OR schedule management faces important challenges due to many non-obvious impact factors (patient, surgeon, OR team, equipment, etc.) that are hard to consider. By integrating the simulation factor and pertinent learning from previous experiences with the team, equipment, and location, more precise schedules that better correspond to reality can be constructed. Also, schedule management can take different time constraints into consideration (e.g., a surgeon wants to operate between 7h and 16h, a nurse without long over-hours, anesthesia in yet different time slots, and administration 24h a day).
- The Manage team use case allows team effectiveness assessment in the light of current and future tasks, schedule, changes, and the like. It gives a comparison between required and best skills and the available ones. It allows identification of team gaps in quantity and/or training. It allows assessment of the impact of personnel changes and training.
- the implementation can be achieved, for example, using a heatmap representation with required/available skills, timeline, and workload updated in real time depending on schedule changes, suggested team and shift allocations, suggested trainings to optimize team adequacy, and others. It allows observation of trends in personal development over time.
- OR Manager can assign specific training to the user. In order to choose the best training, personal assessment of the user can be consulted, compared to the rest of the team (see FIG. 12 and use case Manage team) and schedule (use case Manage schedule) and training availability (use case Get trained).
- the Get analytics use case allows the Administrator to assess his institution’s and its members’ performance. For example, this would be reviewed in an office setting, on a Viewing Station. Based on data, the Administrator could review several graphs and indicators, for example: effective OR, effective OR time vs. planned time, preparation time, instrumentation waiting time, ambience/mood in the OR, event statistics (detected events, number of them, and the like), surgery stage statistics, personnel statistics (who, where, when, etc.), trends, cost analysis, and others. The Administrator could have the possibility to compare his data to a wider pool of data coming from other units in the same institution, other hospitals, other geographies, and worldwide.
- Collected data about the OR and instrumentation use statistics can be of important value for industry, hospitals, and administrators.
- usage statistics can serve as important usability information, events log for risk management, cost analysis for health economics, and the like. This data in raw, pre-processed, or synthesized form can be provided to these institutions.
- virtual reality (VR), augmented reality (AR), and/or mixed reality (MR) systems typically include head-worn and/or face-worn hardware as well as eye-tracking software, and may be used as components of the systems and methods described herein.
- the terms "headset" or "VR headset" refer broadly to such head- or face-worn systems that implement virtual reality, augmented reality, and/or mixed reality functionality.
- the commercial VR/AR headset system has eye-tracking capability.
- An example of a professional-grade VR/AR headset system with eye tracking capability that can be used with the systems and methods described herein is the VIVE Pro Eye Office VR system, manufactured by HTC Corporation (headquartered in Xindian, New Taipei, Taiwan), as described at https://business.vive.com/us/product/vive-pro-eye-office/ and in U.S. Patent No. 10,990,170, entitled, “Eye tracking method, electronic device, and non-transitory computer readable storage medium,” and in U.S. Patent No. 10,705,604, entitled, “Eye tracking apparatus and light source control method thereof,” the disclosure of each of which is incorporated herein by reference in its entirety.
- another example is the Varjo Aero, manufactured by Varjo Technologies Oy (Helsinki, FI).
- a Varjo Aero may be used to track eye movement of a user while rendering and displaying one or more screens, graphics, and/or widgets to the user, for example for object identification and/or tracking for the user.
- Virtual-, augmented-, and/or mixed-reality devices may be used in various embodiments described herein, for example to provide a mechanism with which to show a user where in a setting an object is located and/or to track an object in a setting for a user (e.g., for object detection, e.g., real-time object detection).
- the terms “headset” or “VR headset” refer broadly to such head- or face-worn systems that implement virtual reality, augmented reality, and/or mixed reality functionality.
- a virtual-, augmented-, and/or mixed-reality device (e.g., a VR headset) may present an overlay (e.g., semi-translucent coloring).
- virtual-, augmented-, and/or mixed-reality devices may be used to perform one or more machine vision tasks for a user (e.g., to provide indications of where objects are and/or track objects in a setting, such as, for example, surgical or medical tools in a healthcare setting, such as an operating room), as disclosed herein.
- a virtual-, augmented-, and/or mixed-reality device may be a consumer grade device.
- An example of a professional-grade VR headset system with eye tracking capability that can be used with the systems and methods described herein is the VIVE Pro Eye Office VR system, manufactured by HTC Corporation (headquartered in Xindian, New Taipei, Taiwan), as described at https://business.vive.com/us/product/vive-pro-eye-office/ and in U.S. Patent No. 10,990,170, entitled, “Eye tracking method, electronic device, and non-transitory computer readable storage medium,” and in U.S. Patent No. 10,705,604, entitled, “Eye tracking apparatus and light source control method thereof,” the disclosure of each of which is incorporated herein by reference in its entirety.
- another example is the Varjo Aero, manufactured by Varjo Technologies Oy (Helsinki, FI).
- a Varjo Aero may be used to track eye movement of a user while rendering and displaying one or more screens, graphics, and/or widgets to the user, for example for object identification and/or tracking for the user.
- Fig. 24A shows an illustrative system 2400 that can perform methods described herein.
- the illustrative system 2400 includes a memory 2402 on which instructions are stored that, when executed by processor 2404, perform one or more methods described herein.
- the system 2400 can include a VR headset 2410, for example from which a first and second data stream corresponding to a gaze direction and gaze origin are sent to, and received by, the processor 2404 for use in executing instructions stored on the memory 2402.
- Fig. 24B shows a detailed block diagram of components that may be included in the VR headset 2410, for example if the VR headset 2410 is a VIVE Pro Eye Office VR system.
- the components may include one or more of: (i) a camera 2412 for tracking an eye of a user; (ii) one or more illumination sources LS1, LS2, ..., LSN for illuminating an eye of the user to provide signal to the camera 2412 in order to track the eye; (iii) one or more optics 2416, such as lenses, reflectors, or other light guiding components, for guiding light that has interacted with the eye (e.g., reflected from the eye) to the camera 2412; (iv) a display 420 for displaying images to the user; (v) one or more headset processors 2414 for processing data from camera 2412, for the display 420, or from and/or for other components in the VR headset 2410; and (vi) one or more adjustment and/or head support mechanisms 422 for physically adjusting (e.g., orienting and/or aligning) the VR headset on a user (e.g., relative to an eye of the user and/or for comfort during use).
- the VR headset 2410 is a wearable apparatus that may be worn over one or both eyes of a user at a time.
- the VIVE Pro Eye Office VR system is worn over both eyes but other VR headsets that can be used may have a “monocle” style that is worn over one eye at a time.
- the display 420 may also be used to provide one or more graphics and/or widgets (e.g., simulated images) to the user.
- the processor 2404 that executes instructions may be one of the headset processors 2414.
- the memory 2402 and the processor 2404 may be housed in the VR headset 2410 or may be separately housed, for example in a server or other computing device that is in communication (e.g., wireless communication) with the VR headset 2410.
- the one or more headset processors 2414 may be used to send data stream(s) to the processor 2404, for example wirelessly.
- the VR headset 2410 may also include one or more adjustment and/or head support mechanisms 422.
- Adjustment mechanisms 422 may include one or more mechanical mechanisms, such as knobs, straps, dials, or the like.
- the adjustment mechanisms 422 may be used to adjust the horizontal and/or vertical position of the headset (e.g., a component thereof, such as the display 420) relative to the head of a user.
- the VIVE Pro Eye Office VR system includes mechanism(s) to adjust the interpupillary distance (IPD) to the particular user using the system by an “IPD knob.” IPD adjustment may involve first determining a physical IPD measurement, for example manually by the user or with assistance from another person.
- the VIVE Pro Eye Office VR system includes a lens distance adjustment button that can be pressed to allow a user to adjust the distance of the lens further or closer to his or her face. Such an adjustment may be used to account for user anatomy or other factors, such as glasses or other sight aids. A user may be prompted to make adjustment of one or more of the adjustment mechanism(s) 422 based on a virtual alignment aid to determine whether the VR headset 2410 is properly aligned and/or oriented.
- Head support mechanisms 422 may include one or more physical structure(s) such as strap(s), mount(s), brace(s), padding, or the like.
- the physical structure(s) may be adjustable (e.g., a hook and loop fastener or elastic strap) or compliant (e.g., foam padding) or both (e.g., an adjustable strap with padding).
- the VIVE Pro Eye Office VR system includes a replaceable face cushion that provides compliant support around a user’s eyes for comfort as well as a head pad, adjustment dial, and center strap that collectively secure the system to a user’s head with the adjustment dial being part of the head pad that sits on the back of the head and center strap that runs over the top of the head.
- the adjustment dial can adjust the tension of the center strap and the center strap also has a hook and loop fastener for easy mounting and dismounting from the head.
- a particular mechanism (e.g., structure) may serve both purposes; for example, a strap may be used to secure a VR headset to a wearer and may also be used to adjust a physical position of the VR headset relative to the wearer’s eyes.
- Fig. 24C shows a schematic of how the illustrative VR headset 2410 can be used to track an eye 2401 of a user.
- Illumination sources (light sources) LS1, LS2, ..., LSN provide light to the eye 2401.
- Light is received by reflector 2416 from the eye 2401 after illumination and reflected toward the camera 2412, where it is detected and then processed using a headset processor 2414 that is part of a controller, which refers to lookup table 2418.
- the display 420 simultaneously displays image(s) to the user, for example to prompt eye movement or a particular focus of the user in order to orient or track the eye 2401.
- the illumination sources LS1, LS2, ..., LSN project a plurality of light beams to the eye 2401 on a target zone.
- the light reflection device 2416 receives and reflects the display image IMG of the eye 2401 to the camera 2412.
- the controller with headset processor 2414 is coupled to the camera 2412 and the illumination sources LS1, LS2, ..., LSN.
- a headset processor 2414 receives the display image IMG and analyzes the contrast ratio of the display image IMG.
- the headset processor 2414 additionally generates the command signal DS through a result of the analysis, and controls the turning on or turning off states of each of the illumination sources LS1, LS2, ..., LSN.
- the lookup table 2418 is configured to store the relationship between the turning on/turning off states of the illumination sources LS1, LS2, ..., LSN and the field of view information of the eyeball.
- the lookup table 2418 may be implemented as a memory of any suitable form, which will be apparent to those of skill in the art.
- the lookup table 2418 may be external to the controller with the headset processor 2414 and coupled to the controller. Alternatively, the lookup table 2418 may also be embedded in the controller with the headset processor 2414.
- a virtual-, augmented-, and/or mixed-reality device may include one or more cameras that capture real time images (e.g., in a video feed) of at least a portion of a field of view of a user.
- a virtual-, augmented-, and/or mixed-reality device can provide images to a machine vision module (e.g., included in the device or communicatively coupled thereto) in order for the module (e.g., a machine learning model therein) to use the images as input to perform one or more machine vision tasks (e.g., detection, identification, and/or tracking of one or more objects) (e.g., object detection).
- Output of the module may be rendered and displayed to a user by a virtual-, augmented-, and/or mixed-reality device.
- a machine vision module may take output from a machine learning model (e.g., object identity and/or position) and process it, for example generating one or more graphics and/or widgets.
- a virtual-, augmented-, and/or mixed-reality device may be or include a headset or be or include another display (e.g., be or include a window through which a user views a scene in the user’s field of view) (e.g., a head-up display).
- a user will naturally be looking through a transparent window, such as for example a fume hood in a research or laboratory setting or a protective window on a piece of construction equipment (e.g., a crane or front loader).
- a virtual-, augmented-, and/or mixed-reality device may include or be integrated with such transparent windows.
- the software instructions include a machine learning module, also referred to herein as artificial intelligence software.
- a machine learning module refers to a computer implemented process (e.g., a software function) that implements one or more specific machine learning algorithms, such as an artificial neural network (ANN), a convolutional neural network (CNN), random forest, decision trees, support vector machines, and the like, in order to determine, for a given input, one or more output values.
- the input comprises numerical data, tagged data, and/or functional relationships.
- the input comprises image data and/or alphanumeric data which can include 2D and/or 3D datasets, numbers, words, phrases, or lengthier strings, for example.
- the one or more output values comprise image data (e.g. 2D and/or 3D datasets) and/or values representing numeric values, words, phrases, or other alphanumeric strings.
- the input comprises alphanumeric data which can include numbers, words, phrases, or lengthier strings, for example.
- the one or more output values comprise values representing numeric values, words, phrases, or other alphanumeric strings.
- the machine learning module may include natural language processing (NLP) to allow it to automatically analyze an input alphanumeric string(s) to determine output values classifying a content of the text (e.g., an intent), e.g., as in natural language understanding (NLU).
- a machine learning module may receive as input a textual string (e.g., entered by a human user, for example) and generate various outputs.
- the machine learning module may automatically analyze the input alphanumeric string(s) to determine output values classifying a content of the text (e.g., an intent), e.g., as in natural language understanding (NLU).
- a textual string is analyzed to generate and/or retrieve an output alphanumeric string.
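- As a hedged illustration of such intent classification, the following sketch trains a tiny TF-IDF plus logistic-regression pipeline on a handful of invented utterances and intent labels; it is not the NLU approach of the disclosure.

```python
# Sketch: a tiny intent classifier for OR voice/text commands (TF-IDF + logistic
# regression). The example utterances and intent labels are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["bring new scalpels", "show assembly instructions", "call the circulating nurse",
         "what stage are we in", "bring another retractor", "show me the next step"]
intents = ["request_equipment", "get_help", "communicate",
           "follow_up", "request_equipment", "get_help"]

nlu = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression(max_iter=1000))
nlu.fit(texts, intents)
print(nlu.predict(["please bring extra scalpels"]))   # expected: ['request_equipment']
```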
- machine learning modules implementing machine learning techniques are trained, for example, using datasets that include categories of data described herein. Such training may be used to determine various parameters of machine learning algorithms implemented by a machine learning module, such as weights associated with layers in neural networks.
- a machine learning module is trained, e.g., to accomplish a specific task such as identifying certain output strings, values of determined parameters are fixed and the (e.g., unchanging, static) machine learning module is used to process new data (e.g., different from the training data) and accomplish its trained task without further updates to its parameters (e.g., the machine learning module does not receive feedback and/or updates).
- machine learning modules implementing machine learning techniques may be composed of individual nodes (e.g. units, neurons).
- a node may receive a set of inputs that may include at least a portion of a given input data for the machine learning module and/or at least one output of another node.
- a node may have at least one parameter to apply and/or a set of instructions to perform (e.g., mathematical functions to execute) over the set of inputs.
- node instructions may include a step to provide various relative importance to the set of inputs using various parameters, such as weights.
- the weights may be applied by performing scalar multiplication (e.g., or other mathematical function) between a set of inputs values and the parameters, resulting in a set of weighted inputs.
- a node may have a transfer function to combine the set of weighted inputs into one output value.
- a transfer function may be implemented by a summation of all the weighted inputs and the addition of an offset (e.g., bias) value.
- a node may have an activation function to introduce non-linearity into the output value.
- Non-limiting examples of the activation function include Rectified Linear Activation (ReLu), logistic (e.g., sigmoid), hyperbolic tangent (tanh), and softmax.
- a node may have a capability of remembering previous states (e.g., recurrent nodes). Previous states may be applied to the input and output values using a set of learning parameters.
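- The per-node computation described above (weighted inputs, bias offset, transfer function, activation) can be written out as in the following short sketch; the input values and weights are arbitrary.

```python
# Sketch: the per-node computation (weighted inputs, bias, transfer function, activation).
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def node_output(inputs, weights, bias, activation=relu):
    weighted = inputs * weights            # scalar multiplication of inputs by parameters
    z = weighted.sum() + bias              # transfer function: sum of weighted inputs + offset
    return activation(z)                   # non-linear activation (here ReLU)

x = np.array([0.2, -1.3, 0.7])
w = np.array([0.5, 0.1, -0.4])
print(node_output(x, w, bias=0.05))
```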
- the machine learning module comprises a deep learning architecture composed of nodes organized into layers.
- a layer is a set of nodes that receives data input (e.g., weighted or non-weighted input), transforms it (e.g., by carrying out instructions, e.g., applying a set of functions e.g., linear and/or non-linear functions), and passes transformed values as output (e.g., to the next layer).
- the set of nodes in a particular layer may share the same parameters and instructions without interacting with each other.
- a machine learning module may be composed of at least one layer (e.g., ordered).
- Examples of types of layers include: convolutional layers (e.g., layers with a kernel, a matrix of parameters that is slid across an input to be multiplied with multiple input values to reduce them to a single output value); fully connected (FC) layers (e.g., layers in which each node receives input from all outputs of the previous layer); recurrent layers, such as long/short term memory (LSTM) layers and gated recurrent unit (GRU) layers (e.g., nodes with various abilities to memorize and apply their previous inputs and/or outputs); batch normalization (BN) layers (e.g., layers that normalize a set of outputs from another layer, allowing for more independent learning of individual layers); activation layers (e.g., layers with nodes that only contain an activation function); and/or (un)pooling layers (e.g., layers that reduce (increase) dimensions of an input by summarizing (splitting) input values in defined patches).
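- As a non-limiting sketch, the following PyTorch module combines several of the layer types listed above (convolutional, batch normalization, activation, pooling, fully connected); the shapes and class count are arbitrary and unrelated to any particular OR input.

```python
# Sketch: a small deep-learning module using several of the listed layer types.
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    def __init__(self, n_classes=5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),   # convolutional layer
            nn.BatchNorm2d(16),                           # batch normalization layer
            nn.ReLU(),                                    # activation layer
            nn.MaxPool2d(2),                              # pooling layer
        )
        self.classifier = nn.Linear(16 * 32 * 32, n_classes)  # fully connected layer

    def forward(self, x):
        x = self.features(x)
        return self.classifier(torch.flatten(x, 1))

out = TinyNet()(torch.randn(1, 3, 64, 64))    # one 64x64 RGB input -> 5 class scores
print(out.shape)
```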
- the performance of a machine learning module may be characterized by its ability to produce an output data with specific accuracy.
- a training process is performed to find optimal parameters, such as weights, for each node in each layer of the machine learning module.
- the training process of a machine learning module may involve using output data to calculate an objective function (e.g., cost function, loss function, error function) that needs to be optimized (e.g., minimized, maximized).
- a machine learning objective function may be a combination of a loss function and a regularization parameter. The loss function is related to how well the model's output is able to predict the expected (e.g., target) values.
- the loss function may take various forms, like mean squared error, mean absolute error, binary cross-entropy, categorical cross-entropy, for example.
- the regularization term may be needed to prevent overfitting and improve generalization of the training process. Examples of regularization techniques include L1 Regularization or Lasso Regression, L2 Regularization or Ridge Regression, and Dropout (e.g., dropping layer outputs at random during the training process).
- objective function optimization of a machine learning module may involve finding at least one (e.g., all) of the present global optima (e.g., as opposed to local optima).
- the algorithm for objective function optimization follows principles of mathematical optimization for a multi-variable function and relies on achieving specific accuracy of the process. Examples of objective function optimization algorithms include gradient descent, nonlinear conjugate gradient, random search, Levenberg-Marquardt algorithm, limited-memory Broyden-Fletcher-Goldfarb-Shanno algorithm, pattern search, basin hopping method, Krylov method, Adam method, genetic algorithm, particle swarm optimization, surrogate optimization, and simulated annealing.
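- A minimal worked sketch of optimizing such an objective (mean-squared-error loss plus an L2 regularization term) with plain gradient descent for a linear model is given below; the synthetic data and hyperparameters are assumptions.

```python
# Sketch: gradient descent on objective = MSE loss + L2 (ridge) regularization
# for a linear model. Data and hyperparameters are made up for illustration.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=200)

w = np.zeros(3)
lr, lam = 0.1, 0.01                       # learning rate and regularization strength
for _ in range(500):
    err = X @ w - y
    loss = (err ** 2).mean() + lam * (w ** 2).sum()       # objective function
    grad = 2 * X.T @ err / len(y) + 2 * lam * w           # its gradient
    w -= lr * grad                                        # gradient-descent step
print(w, loss)
```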
- the machine learning modules comprise one or more image-based segmentation neural networks.
- segmentation neural networks include, for instance, Deep Image Matting (DIM), semantic segmentation methods (U-Net, DeepLab Series), Mask R-CNN, Chroma Keying and/or Luminance Keying CNNs, RefineNet, and MODNet.
- the machine learning modules comprise one or more generative AI modules.
- generative AI leverages complex neural networks and algorithms to understand patterns and produce outputs that mimic human creativity.
- Examples of generative Al modules include image synthesis models (e.g., DALL-E3, DALL-E2, Imagen 3 in Gemini, Craiyon, and the like) and text generation models (e.g., ChatGPT, GPT-4, and the like).
- the cloud computing environment 2500 may include one or more resource providers 2502a, 2502b, 2502c (collectively, 2502). Each resource provider 2502 may include computing resources.
- computing resources may include any hardware and/or software used to process data.
- computing resources may include hardware and/or software capable of executing algorithms, computer programs, and/or computer applications.
- illustrative computing resources may include application servers and/or databases with storage and retrieval capabilities.
- Each resource provider 2502 may be connected to any other resource provider 2502 in the cloud computing environment 2500.
- the resource providers 2502 may be connected over a computer network 2508.
- Each resource provider 2502 may be connected to one or more computing device 2504a, 2504b, 2504c (collectively, 2504), over the computer network 2508.
- the cloud computing environment 2500 may include a resource manager 2506.
- the resource manager 2506 may be connected to the resource providers 2502 and the computing devices 2504 over the computer network 2508.
- the resource manager 2506 may facilitate the provision of computing resources by one or more resource providers 2502 to one or more computing devices 2504.
- the resource manager 2506 may receive a request for a computing resource from a particular computing device 2504.
- the resource manager 2506 may identify one or more resource providers 2502 capable of providing the computing resource requested by the computing device 2504.
- the resource manager 2506 may select a resource provider 2502 to provide the computing resource.
- the resource manager 2506 may facilitate a connection between the resource provider 2502 and a particular computing device 2504.
- the resource manager 2506 may establish a connection between a particular resource provider 2502 and a particular computing device 2504. In some implementations, the resource manager 2506 may redirect a particular computing device 2504 to a particular resource provider 2502 with the requested computing resource.
- FIG. 26 shows an example of a computing device 2600 and a mobile computing device 2650 that can be used in the methods and systems described in this disclosure.
- the computing device 2600 is intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers.
- the mobile computing device 2650 is intended to represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, and other similar computing devices.
- the components shown here, their connections and relationships, and their functions, are meant to be examples only, and are not meant to be limiting.
- the computing device 2600 includes a processor 2602, a memory 2604, a storage device 2606, a high-speed interface 2608 connecting to the memory 2604 and multiple high-speed expansion ports 2610, and a low-speed interface 2612 connecting to a low-speed expansion port 2614 and the storage device 2606.
- Each of the processor 2602, the memory 2604, the storage device 2606, the high-speed interface 2608, the high-speed expansion ports 2610, and the low-speed interface 2612 are interconnected using various busses, and may be mounted on a common motherboard or in other manners as appropriate.
- the processor 2602 can process instructions for execution within the computing device 2600, including instructions stored in the memory 2604 or on the storage device 2606 to display graphical information for a GUI on an external input/output device, such as a display 2616 coupled to the high-speed interface 2608.
- multiple processors and/or multiple buses may be used, as appropriate, along with multiple memories and types of memory.
- multiple computing devices may be connected, with each device providing portions of the necessary operations (e.g., as a server bank, a group of blade servers, or a multi-processor system).
- where a function is described as being performed by "a processor," this encompasses embodiments wherein the function is performed by any number of processors (e.g., one or more processors) of any number of computing devices (e.g., one or more computing devices) (e.g., in a distributed computing system).
- the memory 2604 stores information within the computing device 2600.
- the memory 2604 is a volatile memory unit or units.
- the memory 2604 is a non-volatile memory unit or units.
- the memory 2604 may also be another form of computer-readable medium, such as a magnetic or optical disk.
- the storage device 2606 is capable of providing mass storage for the computing device 2600.
- the storage device 2606 may be or contain a computer-readable medium, such as a hard disk device, an optical disk device, a flash memory or other similar solid state memory device, or an array of devices, including devices in a storage area network or other configurations. Instructions can be stored in an information carrier.
- the instructions when executed by one or more processing devices (for example, processor 2602), perform one or more methods, such as those described above.
- the instructions can also be stored by one or more storage devices such as computer- or machine-readable mediums (for example, the memory 2604, the storage device 2606, or memory on the processor 2602).
- the low-speed expansion port 2614 which may include various communication ports (e.g., USB, Bluetooth®, Ethernet, wireless Ethernet) may be coupled to one or more input/output devices, such as a keyboard, a pointing device, a scanner, or a networking device such as a switch or router, e.g., through a network adapter.
- the computing device 2600 may be implemented in a number of different forms, as shown in the figure. For example, it may be implemented as a standard server 2620, or multiple times in a group of such servers. In addition, it may be implemented in a personal computer such as a laptop computer 2622. It may also be implemented as part of a rack server system 2624. Alternatively, components from the computing device 2600 may be combined with other components in a mobile device (not shown), such as a mobile computing device 2650. Each of such devices may contain one or more of the computing device 2600 and the mobile computing device 2650, and an entire system may be made up of multiple computing devices communicating with each other.
- the processor 2652 can execute instructions within the mobile computing device 2650, including instructions stored in the memory 2664.
- the processor 2652 may be implemented as a chipset of chips that include separate and multiple analog and digital processors.
- the processor 2652 may provide, for example, for coordination of the other components of the mobile computing device 2650, such as control of user interfaces, applications run by the mobile computing device 2650, and wireless communication by the mobile computing device 2650.
- the processor 2652 may communicate with a user through a control interface 2658 and a display interface 2656 coupled to the display 2654.
- the display 2654 may be, for example, a TFT (Thin-Film-Transistor Liquid Crystal Display) display or an OLED (Organic Light Emitting Diode) display, or other appropriate display technology.
- the display interface 2656 may comprise appropriate circuitry for driving the display 2654 to present graphical and other information to a user.
- the control interface 2658 may receive commands from a user and convert them for submission to the processor 2652.
- an external interface 2662 may provide communication with the processor 2652, so as to enable near area communication of the mobile computing device 2650 with other devices.
- the external interface 2662 may provide, for example, for wired communication in some implementations, or for wireless communication in other implementations, and multiple interfaces may also be used.
- the memory 2664 stores information within the mobile computing device 2650.
- the memory 2664 can be implemented as one or more of a computer-readable medium or media, a volatile memory unit or units, or a non-volatile memory unit or units.
- An expansion memory 2674 may also be provided and connected to the mobile computing device 2650 through an expansion interface 2672, which may include, for example, a SIMM (Single In Line Memory Module) card interface.
- the expansion memory 2674 may provide extra storage space for the mobile computing device 2650, or may also store applications or other information for the mobile computing device 2650.
- the expansion memory 2674 may include instructions to carry out or supplement the processes described above, and may include secure information also.
- the expansion memory 2674 may be provided as a security module for the mobile computing device 2650, and may be programmed with instructions that permit secure use of the mobile computing device 2650.
- secure applications may be provided via the SIMM cards, along with additional information, such as placing identifying information on the SIMM card in a non-hackable manner.
- the memory may include, for example, flash memory and/or NVRAM memory (non-volatile random access memory), as discussed below.
- instructions are stored in an information carrier and, when executed by one or more processing devices (for example, processor 2652), perform one or more methods, such as those described above.
- the instructions can also be stored by one or more storage devices, such as one or more computer- or machine-readable mediums (for example, the memory 2664, the expansion memory 2674, or memory on the processor 2652).
- the instructions can be received in a propagated signal, for example, over the transceiver 2668 or the external interface 2662.
- the mobile computing device 2650 may communicate wirelessly through the communication interface 2666, which may include digital signal processing circuitry where necessary.
- the communication interface 2666 may provide for communications under various modes or protocols, such as GSM voice calls (Global System for Mobile communications), SMS (Short Message Service), EMS (Enhanced Messaging Service), or MMS messaging (Multimedia Messaging Service), CDMA (code division multiple access), TDMA (time division multiple access), PDC (Personal Digital Cellular), WCDMA (Wideband Code Division Multiple Access), CDMA2000, or GPRS (General Packet Radio Service), among others.
- short-range communication may occur, such as using a Bluetooth®, Wi-Fi™, or other such transceiver (not shown).
- a GPS (Global Positioning System) receiver module 2670 may provide additional navigation- and location- related wireless data to the mobile computing device 2650, which may be used as appropriate by applications running on the mobile computing device 2650.
- the mobile computing device 2650 may also communicate audibly using an audio codec 2660, which may receive spoken information from a user and convert it to usable digital information.
- the audio codec 2660 may likewise generate audible sound for a user, such as through a speaker, e.g., in a handset of the mobile computing device 2650.
- Such sound may include sound from voice telephone calls, may include recorded sound (e.g., voice messages, music files, etc.) and may also include sound generated by applications operating on the mobile computing device 2650.
- the mobile computing device 2650 may be implemented in a number of different forms, as shown in the figure. For example, it may be implemented as a cellular telephone 2680. It may also be implemented as part of a smart-phone 2682, personal digital assistant, or other similar mobile device.
- Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, specially designed ASICs (application specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof.
- These various implementations can include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.
- machine-readable medium and computer-readable medium refer to any computer program product, apparatus and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal.
- machine-readable signal refers to any signal used to provide machine instructions and/or data to a programmable processor.
- the systems and techniques described here can be implemented on a computer having a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user and a keyboard and a pointing device (e.g., a mouse or a trackball) by which the user can provide input to the computer.
- Other kinds of devices can be used to provide for interaction with a user as well; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form, including acoustic, speech, or tactile input.
- the systems and techniques described here can be implemented in a computing system that includes a back end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front end component (e.g., a client computer having a graphical user interface or a Web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back end, middleware, or front end components.
- the components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network (LAN), a wide area network (WAN), and the Internet.
- the computing system can include clients and servers.
- a client and server are generally remote from each other and typically interact through a communication network.
- the relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
- modules described herein can be separated, combined, or incorporated into single or combined modules. Any modules depicted in the figures are not intended to limit the systems described herein to the software architectures shown therein.
- a system for management of an operating room comprising: one or more cameras for receiving visual input from the operating room; one or more displays (e.g., screen, holographic display, or other display) (e.g., a display of an augmented reality (AR), virtual reality (VR), and/or mixed reality (MR) device) for presenting visual output data [e.g., via a graphical user interface and/or presented as an overlay (e.g., via an AR/VR/MR device)] to one or more users; a processor of a computing device for receiving and processing input data (e.g., the visual input received from the one or more cameras, and/or the additional input received from the one or more sensors, and/or the audio input received from the one or more audio I/O devices) from the operating room; and a memory having instructions stored thereon, wherein the instructions, when executed by the processor, cause the processor to use at least a portion of the input data received from the operating room to produce visual and/or audio output data via one or more machine learning modules, said
- the one or more sensors are based on one or more of the following: Wi-Fi, Bluetooth, ZigBee, radio frequency identification (RFID), ultrawide band (UWB), inertial measurement unit (IMU), visible light communication (VLC), infrared (IR), ultrasonic, geomagnetic, light detection and ranging (LiDAR), and computer vision.
- any one of embodiments 1-3 comprising one or more audio input/output (I/O) devices (e.g., microphones and/or speakers) for receiving audio input from the operating room and/or for producing audio output to one or more users, wherein the at least a portion of the input data received from the operating room comprises at least a portion of the additional input and/or wherein the visual and/or audio output data comprises the audio output from the one or more audio I/O devices.
- any one of embodiments 1-4 further comprising a projector for projecting directionally-controlled light (e.g., laser light, a spotlight, or other light) onto one or more objects in the operating room, wherein the instructions, when executed by the processor, cause the processor to use at least a portion of the input data received from the operating room to spatially locate said one or more objects and cause the projector to illuminate said one or more objects as a visual aid for the one or more users in the operating room.
- a system for management of an operating room comprising: one or more input devices, the one or more input devices comprising (i) one or more cameras for receiving visual input from the operating room, (ii) one or more sensors for receiving additional input from the operating room, and/or (iii) one or more audio input/output (I/O) devices for receiving audio input from the operating room (e.g., and also for producing audio output to one or more users); one or more output devices, the one or more output devices comprising: (i) one or more displays for presenting visual output data, (ii) one or more projectors for projecting directionally-controlled light onto one or more objects in the operating room, and/or (iii) one or more audio input/output (I/O) devices for producing audio output (e.g., and optionally also for receiving audio input from the operating room) to one or more users; a processor of a computing device for receiving and processing input data (e.g., the visual input received from the one or more cameras, and/or the additional input received from the
- the one or more functions comprise person identification (e.g., identify who is who in the operating room) [e.g., differentiating between persons based on function (e.g., scrub nurse from surgeon) (e.g., using semantic segmentation)].
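By way of illustration only (not part of the claimed subject matter), the sketch below shows one simple way role differentiation could be approximated from color-coded attire using OpenCV; the HSV color ranges and the role mapping are assumptions, and a deployed system would more plausibly rely on trained semantic-segmentation models as described above.

```python
# Illustrative sketch only: approximate role differentiation from attire color.
# Assumes (hypothetically) that roles wear distinctly colored scrubs/gowns.
import cv2
import numpy as np

# Hypothetical HSV color ranges per role (lower bound, upper bound).
ROLE_COLOR_RANGES = {
    "scrub_nurse": ((90, 80, 40), (130, 255, 255)),   # blue-ish scrubs
    "surgeon": ((35, 60, 40), (85, 255, 255)),        # green-ish gown
}

def estimate_roles(frame_bgr: np.ndarray, min_pixels: int = 5000) -> dict:
    """Return per-role pixel counts for regions matching each attire color."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    detections = {}
    for role, (lo, hi) in ROLE_COLOR_RANGES.items():
        mask = cv2.inRange(hsv, np.array(lo), np.array(hi))
        count = int(cv2.countNonZero(mask))
        if count >= min_pixels:
            detections[role] = count
    return detections

if __name__ == "__main__":
    frame = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in for a camera frame
    print(estimate_roles(frame))
```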
- stage and/or task identification (e.g., identify a current stage of a surgical procedure and/or identify what stage will follow or will likely follow the current stage).
- any one of embodiments 6-17, wherein the one or more functions comprise spatial localization (e.g., spatially localize person(s) and/or equipment in the operating room).
- context segmentation (e.g., identify and/or tag people and/or equipment in the operating room, e.g., tag as sterile/non-sterile and/or mobile/fixed).
- the one or more functions comprise specific training generation (e.g., generate training specific to a particular surgeon and/or procedure(s) customized for one or more particular users, e.g., a scrub, nurse, sales representative, and/or assistant).
- the one or more functions comprise personnel assessment (e.g., assess skills, capabilities, history, and/or style associated with one or more particular system users).
- the one or more functions comprises instrument identification
- the output data rendered for presentation to at least one of the one or more users comprises a graphical overlay (e.g., bounding box) rendered in relation to an identified instrument (e.g., as viewed via an AR or VR display) to graphically annotate (e.g., outline, highlight, denote with text, and/or graphically present a determined probability said identification is accurate) the identified instrument to the at least one of the one or more users via one or more of the one or more displays.
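A minimal illustrative sketch of the kind of graphical annotation described above: it draws a bounding box, label, and confidence for a hypothetical instrument detection on a camera frame with OpenCV. The detection itself is assumed to come from the machine learning modules.

```python
# Illustrative sketch: render a bounding-box overlay with a label and confidence
# for a (hypothetical) identified instrument on a video frame.
import cv2
import numpy as np

def annotate_instrument(frame, box, label, confidence):
    """Draw a rectangle, label text, and confidence on the frame in place."""
    x1, y1, x2, y2 = box
    cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
    text = f"{label} ({confidence:.0%})"
    cv2.putText(frame, text, (x1, max(y1 - 8, 12)),
                cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
    return frame

if __name__ == "__main__":
    frame = np.zeros((480, 640, 3), dtype=np.uint8)
    annotate_instrument(frame, (100, 120, 260, 220), "needle holder", 0.87)
```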
- the one or more functions comprises stage and/or task identification
- the processor determines (i) an identification of a current stage of a surgical procedure and/or (ii) an identification of what stage will follow or will likely follow the current stage
- the output data rendered for presentation to at least one of the one or more users comprises illumination by the projector of one or more instruments that are likely to be used in the current stage and/or the next stage of the surgical procedure.
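A minimal illustrative sketch, assuming a ceiling-mounted, pan/tilt-steerable projector and hypothetical stage-to-instrument and instrument-position tables, of how a predicted stage could be translated into aiming angles for highlighting the instruments likely needed next.

```python
# Illustrative sketch: map a predicted surgical stage to instruments likely needed
# next and convert their (hypothetical) 3D positions into projector pan/tilt angles.
import math

STAGE_TO_INSTRUMENTS = {          # hypothetical stage -> instrument mapping
    "incision": ["scalpel", "forceps"],
    "closure": ["needle holder", "suture"],
}

INSTRUMENT_POSITIONS = {          # hypothetical 3D positions (meters, room frame)
    "scalpel": (1.2, 0.4, 0.9),
    "forceps": (1.3, 0.5, 0.9),
    "needle holder": (0.8, -0.2, 0.95),
    "suture": (0.85, -0.25, 0.95),
}

PROJECTOR_POSITION = (0.0, 0.0, 2.5)  # assumed ceiling-mounted projector

def pan_tilt(target, origin=PROJECTOR_POSITION):
    """Pan/tilt angles (degrees) pointing from the projector to a target point."""
    dx, dy, dz = (t - o for t, o in zip(target, origin))
    pan = math.degrees(math.atan2(dy, dx))
    tilt = math.degrees(math.atan2(dz, math.hypot(dx, dy)))
    return pan, tilt

def highlight_for_stage(stage):
    for name in STAGE_TO_INSTRUMENTS.get(stage, []):
        if name in INSTRUMENT_POSITIONS:
            yield name, pan_tilt(INSTRUMENT_POSITIONS[name])

if __name__ == "__main__":
    for name, angles in highlight_for_stage("closure"):
        print(f"aim projector at {name}: pan/tilt = {angles}")
```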
- any one of embodiments 1-28 comprising one or more sensors and/or one or more cameras operable to detect one or more fiducials in the operating room [e.g., comprising one or more RFID tags, one or more visible codes (e.g., QR codes or bar codes), one or more color-coding of clothes (e.g., surgical/scrub caps), one or more visible tags, one or more optical tracking markers (e.g., reflective spheres), and/or one or more electro-magnetic fiducials] disposed on one or more persons.
- the instructions, when executed by the processor, cause the processor to produce visual and/or audio output data identifying the one or more persons in the operating room via the one or more machine learning modules based on data from the one or more sensors and/or one or more cameras corresponding to the one or more fiducials (e.g., wherein the one or more persons are visually obfuscated [e.g., have an obfuscated face (e.g., obfuscated by a mask) and/or a visually obfuscated body (e.g., obfuscated by a gown or drape)]) (e.g., wherein the one or more persons do not have a direct line of sight with any camera or sensor).
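A minimal illustrative sketch of fiducial-based identification, assuming visible QR-code badges and a hypothetical staff roster; it uses OpenCV's QR detector as a stand-in for whichever fiducial technology (RFID, optical markers, etc.) is actually deployed.

```python
# Illustrative sketch: identify a (possibly masked/gowned) person from a visible
# QR-code fiducial on their attire. The roster mapping codes to staff is hypothetical.
import cv2
import numpy as np

STAFF_ROSTER = {                  # hypothetical fiducial payload -> (person, role)
    "OR-BADGE-001": ("Dr. A", "surgeon"),
    "OR-BADGE-014": ("Nurse B", "scrub nurse"),
}

detector = cv2.QRCodeDetector()

def identify_person(frame_bgr: np.ndarray):
    """Decode a QR fiducial in the frame and look up the wearer, if known."""
    payload, points, _ = detector.detectAndDecode(frame_bgr)
    if payload and payload in STAFF_ROSTER:
        name, role = STAFF_ROSTER[payload]
        return {"name": name, "role": role, "corners": points}
    return None

if __name__ == "__main__":
    frame = np.zeros((480, 640, 3), dtype=np.uint8)
    print(identify_person(frame))  # None for a blank frame
```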
- the one or more machine learning modules are operable to learn (i) instrument performance over time (e.g., in conjunction with instrument-relevant events), (ii) changes in instrument visual appearance (e.g., de-colorization, parts missing or deformed), and/or (iii) usage tracking and to provide the audio and/or visual output data based on the learning.
- the safety event comprises a contamination risk (e.g., for a patient and/or one or more users) (e.g., an audio and/or visual warning).
- any one of embodiments 1-54 wherein the one or more machine learning modules are operable to perform contextual segmentation of sterile and non-sterile elements (e.g., surfaces and clothes) (e.g., based on color) [e.g., based on relative position and/or history (e.g., instruments taken from a sterilization tray are sterile)].
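A minimal illustrative sketch of history-based sterility tracking: a small state machine that marks an item sterile when taken from a sterilization tray and non-sterile after a contaminating event. The event names and rules are assumptions standing in for the contextual segmentation performed by the machine learning modules.

```python
# Illustrative sketch: track sterile/non-sterile status of tagged items from simple
# position/contact events (e.g., "taken from sterilization tray" -> sterile,
# "touched non-sterile surface" -> contaminated).
from dataclasses import dataclass, field

@dataclass
class SterilityTracker:
    status: dict = field(default_factory=dict)  # item id -> "sterile" | "non-sterile"

    def observe(self, item_id: str, event: str) -> None:
        if event == "removed_from_sterilization_tray":
            self.status[item_id] = "sterile"
        elif event in ("touched_non_sterile_surface", "dropped_on_floor"):
            self.status[item_id] = "non-sterile"

    def warnings(self):
        return [f"Sterility loss: {i}" for i, s in self.status.items() if s == "non-sterile"]

if __name__ == "__main__":
    t = SterilityTracker()
    t.observe("retractor-7", "removed_from_sterilization_tray")
    t.observe("retractor-7", "touched_non_sterile_surface")
    print(t.warnings())  # ['Sterility loss: retractor-7']
```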
- a method for management of an operating room comprising: receiving, by a processor of a computing device, input data from the operating room and producing visual and/or audio output data via one or more machine learning modules, said output data rendered for presentation to a user on a display, wherein the input data comprises one or more of (i)-(iii) as follows:
- sensors selected from the group consisting of the following: Wi-Fi, Bluetooth, ZigBee, radio frequency identification (RFID), ultra-wide band (UWB), inertial measurement unit (IMU), visible light communication (VLC), infrared (IR), ultrasonic, geomagnetic, light detection and ranging (LiDAR), and computer vision; and
- a method for management of an operating room comprising: receiving, by a processor of a computing device, input data from the operating room and producing visual and/or audio output data via one or more machine learning modules (e.g., said output data rendered for presentation to a user on a display), wherein the input data comprises one or more of (i)-(iii) as follows:
- embodiment 66 or embodiment 67 comprising producing visual and/or audio output data via the one or more machine learning modules in order to perform one or more functions selected from the group consisting of: person identification (e.g., identify who is who in the operating room); stage and/or task identification (e.g., identify a current stage of a surgical procedure and/or identify what stage will follow or will likely follow the current stage); predictive analytics (e.g., based on past experiences, predict a likely next action and/or a next goal); instrument identification (e.g., identify which instruments are used and/or prepared and/or handled); speech recognition and synthesis (e.g., interpret conversation and produce speech response); human emotion recognition (e.g., recognize emotions of one or more users in the operating room); event detection (e.g., detect selected events in the operating room); spatial localization (e.g., spatially localize person(s) and/or equipment in the operating room); context segmentation (e.g., identify and/or tag people and/or equipment in the operating room)
- the one or more functions comprises instrument identification
- the output data comprises a graphical overlay (e.g., bounding box) rendered in relation to an identified instrument (e.g., as viewed via an AR, VR, or MR display) to graphically annotate (e.g., outline, highlight, denote with text, and/or graphically present a determined probability said identification is accurate) the identified instrument to the at least one of the one or more users via one or more displays.
- the one or more functions comprises context segmentation
- the output data comprises a graphical warning presented via a display or an audible warning presented via an audio output device, said warning indicating a potential loss of sterility (e.g., due to touching of a non-sterile instrument by an individual).
- the one or more functions comprises (i) identifying, by the processor of the computing device, a current stage of a surgical procedure and/or (ii) identifying, by the processor of the computing device, what stage will follow or will likely follow the current stage, and wherein the output data comprises illumination by the projector of one or more instruments that are likely to be used in the current stage and/or the next stage of the surgical procedure.
- stage and/or task identification (e.g., identify a current stage of a surgical procedure and/or identify what stage will follow or will likely follow the current stage).
- any one of embodiments 72-82, wherein the one or more functions comprise spatial localization (e.g., spatially localize person(s) and/or equipment in the operating room).
- the one or more functions comprise context segmentation (e.g., identify and/or tag people and/or equipment in the operating room, e.g., tag as sterile/non-sterile and/or mobile/fixed);
- specific training generation (e.g., generate training specific to a particular surgeon and/or procedure(s) customized for one or more particular users, e.g., a scrub, nurse, sales representative, and/or assistant).
- any one of embodiments 72-88 wherein the one or more functions comprises instrument identification, and wherein the output data rendered for presentation to at least one of the one or more users comprises a graphical overlay (e.g., bounding box) rendered in relation to an identified instrument (e.g., as viewed via an AR or VR display) to graphically annotate (e.g., outline, highlight, denote with text, and/or graphically present a determined probability said identification is accurate) the identified instrument to the at least one of the one or more users via one or more of the one or more displays.
- the one or more functions comprises stage and/or task identification
- the stage and/or task identification comprises (i) an identification of a current stage of a surgical procedure and/or (ii) an identification of what stage will follow or will likely follow the current stage
- the output data rendered for presentation to at least one of the one or more users comprises illumination by the projector of one or more instruments that are likely to be used in the current stage and/or the next stage of the surgical procedure.
- any one of embodiments 66-91 comprising obtaining thermal imaging input from one or more thermal imaging cameras in the operating room and using at least a portion of the thermal imaging input to produce visual and/or audio output data via the one or more machine learning modules identifying one or more persons in the operating room (e.g., wherein the one or more persons are obfuscated [e.g., have an obfuscated face (e.g., obfuscated by a mask) and/or an obfuscated body (e.g., obfuscated by a gown or drape)]).
- any one of embodiments 1-28 comprising detecting, with one or more sensors and/or one or more cameras, one or more fiducials [e.g., comprising one or more RFID tags, one or more visible codes (e.g., QR codes or bar codes), one or more color-coding of clothes (e.g., surgical/scrub caps), one or more visible tags, one or more optical tracking markers (e.g., reflective spheres), and/or one or more electromagnetic fiducials] disposed on one or more persons in the operating room.
- embodiment 93 or embodiment 94 comprising determining a position of the one or more persons in the operating room (e.g., via the one or more machine learning modules) based on data received from the one or more sensors and/or one or more cameras corresponding to the one or more fiducials (e.g., wherein the one or more persons do not have a direct line of sight with any camera or sensor) (e.g., using skeletal approximation).
- any one of embodiments 66-95 comprising determining identity and/or position of one or more persons in the operating room (e.g., via the one or more machine learning modules) using a combination of (e.g., in sequence) (e.g., combined using a Kalman filter) facial recognition, 3D mesh based identification and/or tracking, fiducial-based identification and/or positioning, contextual identification and/or positioning (e.g., based on a person's spatial location, behavior, and/or tasks), thermal imaging, one or more personal characteristics (e.g., voice, gait, movement, characteristic elements), and wireless identification and/or positioning (e.g., Bluetooth location or other near-field communication) (e.g., using skeletal approximation).
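A minimal illustrative sketch of the multi-modality fusion with a Kalman filter mentioned above: a constant-velocity filter fuses 2D position fixes from two sources (e.g., a camera and UWB) with different assumed noise levels. All parameter values are hypothetical.

```python
# Illustrative sketch: fuse noisy 2D position measurements from two modalities
# with a constant-velocity Kalman filter. Noise parameters are assumed values.
import numpy as np

class PositionFuser:
    def __init__(self, dt=0.1):
        self.x = np.zeros(4)                      # state: [px, py, vx, vy]
        self.P = np.eye(4)                        # state covariance
        self.F = np.eye(4); self.F[0, 2] = self.F[1, 3] = dt   # constant-velocity model
        self.Q = 0.01 * np.eye(4)                 # assumed process noise
        self.H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], dtype=float)  # observe position

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q

    def update(self, z, meas_var):
        R = meas_var * np.eye(2)                  # per-modality measurement noise
        y = np.asarray(z, dtype=float) - self.H @ self.x
        S = self.H @ self.P @ self.H.T + R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P

if __name__ == "__main__":
    f = PositionFuser()
    f.predict(); f.update([2.0, 1.0], meas_var=0.05)   # camera fix (lower noise)
    f.predict(); f.update([2.1, 1.1], meas_var=0.20)   # UWB fix (higher noise)
    print(f.x[:2])  # fused position estimate
```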
- any one of embodiments 66-96 comprising determining identity and/or position of one or more persons in the operating room using a combination of audio and visual identification and/or positioning (e.g., based on data received from one or more cameras and one or more audio input/output devices (e.g., microphones and/or speakers)) (e.g., using skeletal approximation).
- any one of embodiments 66-97 comprising producing visual and/or audio output data for stage and/or task identification (e.g., identify a current stage of a surgical procedure and/or identify what stage will follow or will likely follow the current stage) via the one or more machine learning modules based on instrument identification for one or more instruments in the operating room performed with the one or more machine learning modules.
- any one of embodiments 66-98 comprising identifying a current stage and/or task of a surgery via the one or more machine learning modules based on persons present and/or their positions, instruments used, spoken language, noise signature, history of current case, and/or comparison to other cases reflected in the input data (e.g., as determined with the one or more machine learning modules) (e.g., and produce corresponding visual and/or audio output data).
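A minimal illustrative sketch of stage identification from several weak signals (instruments in use, keywords heard): a rule-based scoring stand-in for the trained machine learning modules; the stage profiles and weights are assumptions.

```python
# Illustrative sketch: score candidate surgical stages from simple observed signals.
OBSERVED = {
    "instruments": {"scalpel", "forceps"},
    "roles_present": {"surgeon", "scrub nurse"},
    "keywords": {"incision"},
}

STAGE_PROFILES = {   # hypothetical stage descriptions
    "incision": {"instruments": {"scalpel", "forceps"}, "keywords": {"incision", "cut"}},
    "closure":  {"instruments": {"needle holder", "suture"}, "keywords": {"close", "suture"}},
}

def score_stage(observed, profile):
    score = 0
    score += 2 * len(observed["instruments"] & profile["instruments"])  # instruments weigh more
    score += 1 * len(observed["keywords"] & profile["keywords"])
    return score

def most_likely_stage(observed):
    return max(STAGE_PROFILES, key=lambda s: score_stage(observed, STAGE_PROFILES[s]))

if __name__ == "__main__":
    print(most_likely_stage(OBSERVED))  # 'incision'
```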
- the one or more machine learning modules have been trained using real-time data coming from surgery in the operating room (e.g., surgery stage and task data, identified instrumentation, person identification and/or localization data).
- the visual and/or audio output data correspond to (e.g., comprise) a VR/AR/MR overlay [e.g., wherein the method comprises rendering (e.g., and displaying) the VR/AR/MR overlay to a user (e.g., on a VR/AR/MR headset)].
- the method of embodiment 103 comprising receiving input (e.g., audio input) from a user and resolving the low confidence level based on the received input (e.g., an audio indication of what the instrument is).
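A minimal illustrative sketch of resolving a low-confidence identification with user input: below an assumed confidence threshold, the system defers to the user's spoken answer; `transcribe` is a placeholder for any speech-to-text component.

```python
# Illustrative sketch: human-in-the-loop resolution of a low-confidence identification.
from typing import Optional, Tuple

CONFIDENCE_THRESHOLD = 0.6  # assumed value

def transcribe(audio_reply: str) -> str:
    # Stand-in: a real system would run speech recognition on captured audio here.
    return audio_reply.strip().lower()

def resolve_identification(label: str, confidence: float,
                           audio_reply: Optional[str] = None) -> Tuple[Optional[str], float]:
    """Keep high-confidence labels; otherwise defer to an explicit user answer."""
    if confidence >= CONFIDENCE_THRESHOLD:
        return label, confidence
    if audio_reply is not None:
        return transcribe(audio_reply), 1.0     # trust the user's explicit answer
    return None, confidence                     # defer until user input arrives

if __name__ == "__main__":
    print(resolve_identification("curette", 0.42, audio_reply="That is a Cobb elevator"))
```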
- any one of embodiments 66-104 comprising training the one or more machine learning modules to learn (i) instrument performance over time (e.g., in conjunction with instrument-relevant events), (ii) changes in instrument visual appearance (e.g., decolorization, parts missing or deformed), and/or (iii) usage tracking and to provide the audio and/or visual output data based on the learning.
- the safety event comprises a contamination risk (e.g., for a patient and/or one or more users) (e.g., an audio and/or visual warning).
- embodiment 111 or embodiment 112 comprising determining, via the one or more machine learning modules, using the input data that there has been incorrect room preparation (e.g., missing sterile drape(s)), incorrect instrumentation handling (e.g., assembly of instruments, like powered devices, where non-sterile battery needs to be inserted into sterile cover), incorrect aseptic technique when putting on sterile clothes, filling-up liquids, accidental touching of non-sterile surfaces by sterile equipment, passing too close to sterile equipment, or one or more instruments falling on the ground.
- any one of embodiments 66-112, wherein the one or more machine learning modules have been trained (e.g., wherein the method comprises training the one or more machine learning modules) using visual input with audio-derived annotations (e.g., labels) for stage and/or task identification [e.g., indicative of a safety event (e.g., a user auditorily indicating “sterility loss”)].
- the method of any one of embodiments 66-123 comprising outputting via the one or more machine learning modules, based on the input data, information on effectiveness, effective OR time vs. planned time, preparation time, instrumentation waiting time, ambience/mood in the OR, event statistics (e.g., detected events and/or number of them), surgery stages statistics, personnel statistics, trends, costs, or a combination thereof in order to analyze personal behavior from the operating room.
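A minimal illustrative sketch of deriving two of the metrics listed above (preparation time and effective OR time vs. planned time) from timestamped events; the event names and example timeline are hypothetical.

```python
# Illustrative sketch: compute simple OR efficiency metrics from timestamped events.
from datetime import datetime

def minutes(a: datetime, b: datetime) -> float:
    return (b - a).total_seconds() / 60.0

def or_metrics(events: dict, planned_minutes: float) -> dict:
    prep = minutes(events["room_ready"], events["incision"])
    effective = minutes(events["incision"], events["closure_complete"])
    return {
        "preparation_min": prep,
        "effective_or_min": effective,
        "effective_vs_planned": effective / planned_minutes,
    }

if __name__ == "__main__":
    timeline = {   # hypothetical event timeline
        "room_ready":       datetime(2024, 1, 1, 7, 30),
        "incision":         datetime(2024, 1, 1, 8, 5),
        "closure_complete": datetime(2024, 1, 1, 9, 50),
    }
    print(or_metrics(timeline, planned_minutes=120))
```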
- the method of any one of embodiments 66-124 comprising outputting via the one or more machine learning modules information on one or more direct patient impact events (e.g., errors with manipulating surgical tracking and navigation equipment and/or hitting patient measurement markers attached to anatomy) based on the input data.
- any one of embodiments 66-127 comprising determining, via the one or more machine learning modules, whether needed equipment is present in the operating room based on a user request (e.g., an auditory request) for confirmation and the input data, wherein the audio and/or visual output data confirm presence or absence of the equipment.
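A minimal illustrative sketch of confirming equipment presence from a spoken request by matching it against the current detection inventory; the item names are hypothetical, and a real system would use the speech recognition and detection functions described above.

```python
# Illustrative sketch: answer "is X in the room?" from a detection inventory.
from typing import Set

def confirm_equipment(request_text: str, detected_items: Set[str]) -> str:
    """Report which requested items appear in the set of currently detected items."""
    requested = request_text.lower()
    hits = [item for item in detected_items if item.lower() in requested]
    if hits:
        return f"Confirmed present: {', '.join(sorted(hits))}."
    return "Requested equipment was not detected in the operating room."

if __name__ == "__main__":
    inventory = {"C-arm", "suction unit", "navigation tower"}
    print(confirm_equipment("Do we have the C-arm and the suction unit ready?", inventory))
```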
Landscapes
- Business, Economics & Management (AREA)
- General Business, Economics & Management (AREA)
- Health & Medical Sciences (AREA)
- Engineering & Computer Science (AREA)
- Biomedical Technology (AREA)
- Epidemiology (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Primary Health Care (AREA)
- Public Health (AREA)
- Electrically Operated Instructional Devices (AREA)
Abstract
Disclosed are systems and methods that assist an operating room (OR) team during all stages of a surgery, facilitate management of the operating room, and/or enable surgeon- or surgery-specific training, analytics, and intelligence for administrators, and provide optimization algorithms to reduce operating-room-related costs. In some embodiments, the systems and methods described herein comprise an operating room (OR) hardware module with cameras and sensors, processing stations, a cloud, terminals, and associated machine learning (ML) software.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202363547112P | 2023-11-02 | 2023-11-02 | |
| US63/547,112 | 2023-11-02 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025093970A1 (fr) | 2025-05-08 |
Family
ID=93430349
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/IB2024/059933 Pending WO2025093970A1 (fr) | 2023-11-02 | 2024-10-10 | Systèmes et procédés de gestion de salle opératoire |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2025093970A1 (fr) |
Citations (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US10705604B2 (en) | 2018-05-22 | 2020-07-07 | Htc Corporation | Eye tracking apparatus and light source control method thereof |
| US20200315734A1 (en) * | 2019-04-04 | 2020-10-08 | The Board Of Regents Of The University Of Oklahoma | Surgical Enhanced Visualization System and Method of Use |
| US10990170B2 (en) | 2018-03-15 | 2021-04-27 | Htc Corporation | Eye tracking method, electronic device, and non-transitory computer readable storage medium |
| US20220117669A1 (en) * | 2019-02-05 | 2022-04-21 | Smith & Nephew, Inc. | Augmented reality in arthroplasty surgery |
| WO2022219491A1 (fr) * | 2021-04-14 | 2022-10-20 | Cilag Gmbh International | Système et procédé de suivi d'une partie de l'utilisateur en tant que substitut pour instrument non surveillé |
| US20220331047A1 (en) * | 2021-04-14 | 2022-10-20 | Cilag Gmbh International | Method for intraoperative display for surgical systems |
| WO2023002385A1 (fr) * | 2021-07-22 | 2023-01-26 | Cilag Gmbh International | Identification de plateforme et suivi d'objets et de personnel dans la ou les données de recouvrement personnalisées selon le besoin de l'utilisateur |
Patent Citations (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US10990170B2 (en) | 2018-03-15 | 2021-04-27 | Htc Corporation | Eye tracking method, electronic device, and non-transitory computer readable storage medium |
| US10705604B2 (en) | 2018-05-22 | 2020-07-07 | Htc Corporation | Eye tracking apparatus and light source control method thereof |
| US20220117669A1 (en) * | 2019-02-05 | 2022-04-21 | Smith & Nephew, Inc. | Augmented reality in arthroplasty surgery |
| US20200315734A1 (en) * | 2019-04-04 | 2020-10-08 | The Board Of Regents Of The University Of Oklahoma | Surgical Enhanced Visualization System and Method of Use |
| WO2022219491A1 (fr) * | 2021-04-14 | 2022-10-20 | Cilag Gmbh International | Système et procédé de suivi d'une partie de l'utilisateur en tant que substitut pour instrument non surveillé |
| US20220331047A1 (en) * | 2021-04-14 | 2022-10-20 | Cilag Gmbh International | Method for intraoperative display for surgical systems |
| WO2023002385A1 (fr) * | 2021-07-22 | 2023-01-26 | Cilag Gmbh International | Identification de plateforme et suivi d'objets et de personnel dans la ou les données de recouvrement personnalisées selon le besoin de l'utilisateur |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US12380986B2 (en) | Virtual guidance for orthopedic surgical procedures | |
| JP6949128B2 (ja) | システム | |
| US20210313050A1 (en) | Systems and methods for assigning surgical teams to prospective surgical procedures | |
| US20210313051A1 (en) | Time and location-based linking of captured medical information with medical records | |
| US20200126661A1 (en) | Augmented reality for predictive workflow in an operating room | |
| CN116230153A (zh) | 医疗助理 | |
| US20230093342A1 (en) | Method and system for facilitating remote presentation or interaction | |
| WO2025093970A1 (fr) | Systèmes et procédés de gestion de salle opératoire |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 24801983 Country of ref document: EP Kind code of ref document: A1 |