
WO2025180921A1 - A system for providing guidance for selection of a surgical instrument, a method and a computer program product - Google Patents

A system for providing guidance for selection of a surgical instrument, a method and a computer program product

Info

Publication number
WO2025180921A1
Authority
WO
WIPO (PCT)
Prior art keywords
surgical
instruments
image capture
guidance
server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/EP2025/054496
Other languages
French (fr)
Inventor
Paul Springer
Zoltan Facius
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sony Europe Bv
Sony Group Corp
Original Assignee
Sony Europe Bv
Sony Group Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sony Europe Bv, Sony Group Corp filed Critical Sony Europe Bv
Publication of WO2025180921A1 publication Critical patent/WO2025180921A1/en
Pending legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H - HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H40/00 - ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices
    • G16H40/20 - ICT specially adapted for the management or administration of healthcare resources or facilities, e.g. managing hospital staff or surgery rooms
    • A - HUMAN NECESSITIES
    • A61 - MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B - DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B34/00 - Computer-aided surgery; Manipulators or robots specially adapted for use in surgery
    • A61B34/25 - User interfaces for surgical systems
    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 - Machine learning
    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H - HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H20/00 - ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance
    • G16H20/40 - ICT specially adapted for therapies or health-improving plans relating to mechanical, radiation or invasive therapies, e.g. surgery, laser therapy, dialysis or acupuncture
    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H - HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H30/00 - ICT specially adapted for the handling or processing of medical images
    • G16H30/40 - ICT specially adapted for the handling or processing of medical images for processing medical images, e.g. editing
    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H - HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H40/00 - ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices
    • G16H40/60 - ICT specially adapted for the operation of medical equipment or devices
    • G16H40/67 - ICT specially adapted for the operation of medical equipment or devices for remote operation
    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H - HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 - ICT specially adapted for mining of medical data, e.g. analysing previous cases of other patients
    • A - HUMAN NECESSITIES
    • A61 - MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B - DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B34/00 - Computer-aided surgery; Manipulators or robots specially adapted for use in surgery
    • A61B34/20 - Surgical navigation systems; Devices for tracking or guiding surgical instruments, e.g. for frameless stereotaxis
    • A61B2034/2046 - Tracking techniques
    • A61B2034/2065 - Tracking using image or pattern recognition
    • A - HUMAN NECESSITIES
    • A61 - MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B - DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B34/00 - Computer-aided surgery; Manipulators or robots specially adapted for use in surgery
    • A61B34/20 - Surgical navigation systems; Devices for tracking or guiding surgical instruments, e.g. for frameless stereotaxis
    • A - HUMAN NECESSITIES
    • A61 - MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B - DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B90/00 - Instruments, implements or accessories specially adapted for surgery or diagnosis and not covered by any of the groups A61B1/00 - A61B50/00, e.g. for luxation treatment or for protecting wound edges
    • A61B90/36 - Image-producing devices or illumination devices not otherwise provided for
    • A61B90/361 - Image-producing devices, e.g. surgical cameras

Definitions

  • the present disclosure relates to a system for providing guidance for selection of a surgical instrument, a method and a computer program product.
  • Surgical procedures are complex procedures which require use of a number of different surgical instruments.
  • the type of surgical instrument which is required at any given time may vary depending on the type of the surgical procedure being performed and the stage of the procedure.
  • Reliable preparation and selection of surgical instruments for a surgical procedure is of significant importance in improving safety and efficiency of a surgical procedure.
  • Embodiments of the present disclosure are defined by the independent claims. Further aspects of the disclosure are defined by the dependent claims.
  • a more reliable way of preparing and selecting surgical instruments is provided, while data relating to the perioperative process is utilized in a manner complying with restrictions on usage.
  • safety and efficiency of a surgical procedure can be improved. This may further improve patient outcomes from surgical procedures.
  • Figure 1 illustrates an example of a schematic configuration of a surgical system to which embodiments of the disclosure can be applied;
  • Figure 2 illustrates an example system in accordance with embodiments of the disclosure;
  • Figure 3 illustrates an example configuration of components of an example system in accordance with embodiments of the disclosure;
  • Figure 4 illustrates an example configuration of imaging devices in accordance with embodiments of the disclosure;
  • Figure 5 illustrates an example implementation of a system in accordance with embodiments of the disclosure;
  • Figure 6 illustrates an example provision of guidance in accordance with embodiments of the disclosure; and
  • Figure 7 illustrates a method in accordance with embodiments of the disclosure.
  • In Figure 1, a block diagram illustrating an example of a schematic configuration of a surgical system 5000 to which the technology according to the present disclosure can be applied is shown.
  • Figure 1 illustrates a state where an operator (doctor) 5067 is conducting surgery on a patient 5071 on a patient bed 5069.
  • the surgery being performed on the patient 5071 is an endoscopic surgical procedure using the endoscopic surgery system 5000.
  • the present disclosure is not particularly limited in this regard. More generally, embodiments of the present disclosure can be applied to any type of surgical procedure.
  • embodiments of the disclosure may be applied to any type of surgical procedure including, but not limited to, general surgery, dental surgery, appendectomy, cataract surgery, heart surgery, neurosurgery, or the like. Therefore, the present disclosure is not particularly limited to this example situation described with reference to Figure 1.
  • the endoscopic surgery system 5000 is constituted by an endoscope 5001, other surgical tools 5017, a support arm device 5027 supporting the endoscope 5001, and a cart 5037 on which various devices for endoscopic surgery are mounted.
  • the abdominal wall is punctured with a plurality of tubular hole-opening instruments called trocars 5025a to 5025d instead of cutting the abdominal wall to open the abdomen.
  • a lens barrel 5003 of the endoscope 5001 and the other surgical tools 5017 are inserted into a body cavity of the patient 5071 through the trocars 5025a to 5025d.
  • as the other surgical tools 5017, an insufflation tube 5019, an energy treatment tool 5021, and forceps 5023 are inserted into the body cavity of the patient 5071.
  • the energy treatment tool 5021 is a treatment tool that performs incision and peeling of a tissue, sealing of a blood vessel, or the like using high-frequency current or ultrasonic vibration.
  • the illustrated surgical tool 5017 is merely an example, and various surgical tools generally used in endoscopic surgery, for example, tweezers, a retractor, and the like may be used as the surgical tool 5017.
  • An image of an operation site in the body cavity of the patient 5071 captured by the endoscope 5001 is displayed on a display device 5041.
  • the operator 5067 performs treatment, for example, to excise an affected site using the energy treatment tool 5021 or the forceps 5023 while viewing the image of the operation site displayed by the display device 5041 in real time.
  • the insufflation tube 5019, the energy treatment tool 5021, and the forceps 5023 are supported by the operator 5067, an assistant, or the like during surgery although not illustrated.
  • Embodiments of the present disclosure may be applied to, and used with, an endoscopic surgical system 5000 as described with reference to Figure 1 of the present disclosure.
  • surgical procedures (such as that described with reference to Figure 1 of the present disclosure) require the use of a number of different surgical instruments.
  • the operator (a doctor) may require an instrument such as tweezers, retractors, an energy treatment tool or the like.
  • the type of surgical instrument which is required at any given time may vary depending on the type of the surgical procedure being performed and the stage of the procedure. Therefore, the present disclosure is not particularly limited to any specific example of a surgical instrument. More generally, the surgical instrument is any device which may be required during a surgical procedure for performing a specific action and/or for obtaining a desired effect.
  • Reliable preparation and selection of surgical instruments for a surgical procedure is of significant importance in improving safety and efficiency of a surgical procedure. This can reduce instances of a missing instrument during the surgical procedure and reduce instances of an incorrect instrument being passed to the operator (doctor) during the surgical procedure.
  • embodiments of the present disclosure provide a system for providing guidance for selection of a surgical instrument, a method and a computer program product.
  • Figure 2 illustrates an example system in accordance with embodiments of the disclosure.
  • At least a part of the system of Figure 2 may be provided within a surgical environment (a surgical theatre) in which an operator (a doctor) is performing (or will perform) a surgical procedure on a patient.
  • the system of Figure 2 may be provided in a surgical environment where a surgical system 5000 as described with reference to Figure 1 of the present disclosure will be used.
  • the system illustrated in Figure 2 of the present disclosure is a system for providing guidance for selection of a surgical instrument.
  • the system for providing guidance for selection of a surgical instrument comprises an image capture device 100 and a server 200.
  • the system illustrated with reference to the example of Figure 2 of the present disclosure may include one or more additional elements alongside the image capture device 100 and the server 200.
  • the system may further include a display device for displaying guidance information generated by the system to the user.
  • a display device may not necessarily be part of the system; in this case, the system may utilize a pre-existing display device within the surgical environment for displaying the guidance information which has been generated.
  • the system may comprise a plurality of image capture devices. This is described in more detail later.
  • the image capture device 100 is configured to acquire image data of a scene; analyse the image data to identify one or more instruments located in the scene; and transmit metadata, generated in accordance with the identified instruments, to a server.
  • the image capture device 100 is provided within the surgical theatre. Therefore, the image capture device acquires image data within this surgical theatre.
  • the image capture device 100 may be configured to acquire image data of a surface 300 (such as an instrument table, a surgical tray or the like) upon which surgical instruments for use during a surgical procedure are arranged.
  • the surgical instruments may be arranged on the surface 300 ahead of the surgical procedure by an assistant. Then, during the surgical procedure, the assistant may pass a surgical instrument from the surface 300 to the operator (the doctor, such as a surgeon).
  • the image capture device 100 which is configured to acquire image data of the scene may be configured in order to acquire image data of the scene including image data of a surface on which surgical instruments are to be arranged.
  • the image data of the scene may not be limited specifically only to the surface 300.
  • the image data may include image data of the entire surgical environment, with the surface 300 forming a part of this image data.
  • the image capture device may capture image data only of the surface 300.
  • the image capture device 100 may be configured to acquire image data of any type or format.
  • the image capture device may be configured to acquire high resolution image data, including image data of a 4K or 8K resolution, for example.
  • the image data may comprise still image data (individual image frames) or moving image data (image frames forming part of a video).
  • the type, format and resolution of the image data acquired by the image capture device is not particularly limited in accordance with embodiments of the disclosure.
  • the image data acquired by the image capture device is processed locally at the image capture device (or within the surgical theatre side of the system). That is, the image capture device according to embodiments of the disclosure is provided with circuitry such that the image capture device can perform processing directly on the image data which is acquired.
  • the processing performed on the image data which has been acquired includes analysis of the image data to identify one or more instruments located in the scene (e.g. one or more instruments located on the surface 300). For example, this analysis may identify that tweezers and retractors have been provided on the surface 300.
  • the analysis of the image data which has been acquired is used, locally by the image capture device, in order to generate metadata for transmittance to the server.
  • the metadata may include an indication of the type of surgical instruments which have been identified.
  • the metadata may include an indication of the location of surgical instruments which have been identified. The location of the surgical instruments can include a position and orientation of the surgical instruments on the surface 300.
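  • As an illustration only, the metadata described above might be structured as in the following minimal sketch; the field names and units are assumptions for illustration, not a format defined by the disclosure.

```python
# Hypothetical metadata structure for identified instruments; all field
# names and units are illustrative assumptions, not part of the disclosure.
from dataclasses import dataclass, field, asdict
import json

@dataclass
class InstrumentObservation:
    instrument_type: str      # e.g. "tweezers", "retractor"
    x_mm: float               # location on the surface 300
    y_mm: float
    orientation_deg: float    # in-plane orientation of the instrument
    confidence: float         # detection confidence reported by the model

@dataclass
class SceneMetadata:
    procedure_id: str         # anonymised procedure identifier
    timestamp_utc: str
    instruments: list = field(default_factory=list)

    def to_json(self) -> str:
        # Only this anonymised metadata leaves the device; the image
        # data itself is never transmitted.
        return json.dumps(asdict(self))

meta = SceneMetadata(
    procedure_id="proc-0001",
    timestamp_utc="2025-01-01T09:00:00Z",
    instruments=[InstrumentObservation("tweezers", 120.0, 45.0, 30.0, 0.97)],
)
print(meta.to_json())
```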
  • the metadata is transmitted to the server 200.
  • the server 200 may not necessarily be located in the same environment as the image capture device 100. That is, while the image capture device 100 is located within the surgical theatre, the server 200 may be located outside the surgical theatre (this may include any suitable external location of the server 200). However, in examples, the server which performs the processing to generate the guidance information may also be located within the surgical environment.
  • image data which has been acquired by the image capture device is not transmitted outside the surgical environment (even if the server 200 is located outside the surgical environment). Rather, only metadata which has been generated based on the analysis of the image data which has been acquired is transmitted to the server. Transmittance of the metadata to the server may be performed using any suitable wired or wireless communication interface.
  • the data may be communicated over a network such as the internet. Alternatively, the data may be communicated over a local intranet.
  • GDPR (General Data Protection Regulation)
  • the data transmitted to the server may not be personally identifiable, such that the system can be made compliant with GDPR and other restrictions on usage of data relating to the perioperative process.
  • the information transmitted to the server side comprises the metadata information generated on the basis of the image data which has been acquired, but does not include the image data which has been acquired.
  • the image data acquired by the image capture device can be purged (thus negating any requirement for storage of the image data).
  • the image data may include high resolution image data including 4K and 8K images of the surgical scene. Accordingly, the impact of the image data on the network traffic and overheads consumed by the system can be suppressed, since the image data is neither transmitted to the server side nor stored locally at the image capture device.
  • the server 200 may be any suitable computer hardware or software which provides functionality which can be utilized for provision of guidance for selection of a surgical instrument.
  • the server 200 is configured to process the metadata to generate guidance in accordance with the surgical instruments which have been identified and a standard protocol; and control a display device to display the guidance to a user, the guidance comprising an indication of a surgical instrument.
  • the server may control a display device within the surgical theatre (such as a display screen, a display unit, a projector, a speaker or the like) in order to provide guidance (including both audio and/or visual guidance) to a user within the surgical theatre.
  • the guidance may include an indication of a surgical instrument, such as a surgical instrument which should be selected and passed to the operator (doctor).
  • the guidance may include an indication of a surgical instrument which is missing from the surgical instruments which have been identified and which is likely to be required in the surgical procedure.
  • the guidance may include validation of a surgical instrument which has been identified.
  • the server may generate the guidance in accordance with both the surgical instruments which have been identified (from the metadata received from the image capture device) and a standard protocol.
  • the standard protocol may be a protocol defining one or more surgical instruments which are required.
  • the standard protocol may be related to the surgical procedure. A different surgical procedure (or indeed, a different stage of the same surgical procedure) may require a different surgical instrument.
  • the standard protocol may be selected in accordance with an identification and/or indication of the surgical procedure and/or a current stage of the surgical procedure. This will be described in more detail later.
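  • A minimal sketch of such protocol selection follows; the protocol table and its contents are invented placeholders, and in practice the protocols would be retrieved from the storage unit 202 described later.

```python
# Hypothetical standard-protocol table mapping (procedure, stage) to the
# set of required instruments; the entries are placeholders.
STANDARD_PROTOCOLS = {
    ("appendectomy", "incision"): {"scalpel", "forceps", "retractor"},
    ("appendectomy", "excision"): {"forceps", "energy_treatment_tool"},
}

def required_instruments(procedure: str, stage: str) -> set:
    """Select the stored protocol entry for the current procedure and stage."""
    try:
        return STANDARD_PROTOCOLS[(procedure, stage)]
    except KeyError:
        raise ValueError(f"no protocol stored for {procedure!r} / {stage!r}")

print(required_instruments("appendectomy", "incision"))
```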
  • the server 200 may utilize external or cloud based computation resources 400 in order to generate the guidance, since the metadata which has been provided to the server is compliant with data restrictions.
  • the server may communicate with the external or cloud based computation resources 400 using any suitable wired or wireless means.
  • the server is configured to control a display device to provide the guidance information to the user. This enables more accurate and reliable preparation and selection of surgical instruments.
  • the assistant may arrange surgical instruments including tweezers and retractors.
  • the image capture device 100 may acquire image data of the surface 300 and generate metadata indicating that tweezers and retractors have been placed on the surface 300. This metadata may then be transmitted to the server.
  • the server 200 may then identify that a surgical instrument is missing from the set of surgical instruments that have been arranged on the surface 300. For example, the server 200 may identify that, based on the standard protocol information, an energy treatment tool should also be provided in addition to the tweezers and retractors. Guidance information may then be generated by the server and displayed to the assistant within the surgical theatre. The guidance information may instruct the assistant that an energy treatment tool should also be provided in addition to the tweezers and retractors which have already been arranged on the surface.
  • the guidance information may be provided to the assistant through control of a display device, by the server 200, such that an image of the missing energy treatment tool is displayed to the assistant, such that the assistant can immediately understand that the energy treatment tool should be provided in addition to the instruments which have already been prepared and placed on the surface 300.
  • the energy treatment tool may then be added to the surface by the assistant on the basis of the guidance information.
  • when - during the surgical procedure - the operator (a doctor) requests that the energy treatment tool is provided, the assistant will be able to provide the tool to the operator (since the energy treatment tool has been added to the surface 300 on the basis of the guidance information). Accordingly, inaccurate or incomplete preparation of surgical instruments can be avoided.
  • the system may be used in at least three different types of situations.
  • the system may be used to validate a surgical instrument set prepared by an operation technical assistant against a standard protocol (defining the instruments generally required for a specific type of procedure).
  • the system may be used for cross-checking for completeness and precise arrangement of instruments on a surface (such as a table) by the assistant ahead of the surgical procedure.
  • the system may be used to recommend a special or extended surgical instrument set. This may be based on the standard protocol (defining a standard surgical instrument set) and also on one or more auxiliary factors.
  • the one or more auxiliary factors may be based on a pre-existing condition of the patient, a preference or requirement of the operator (doctor) who will be performing the operation or the like.
  • the system may be used for cross-checking for completeness and precise arrangement of instruments on a surface (such as a table) by the assistant ahead of the surgical procedure and also for recommending a special or extended surgical instrument set (one or more additional instruments beyond the standard protocol).
  • the system may be used during the procedure to make instrument passing/transfer more secure and to improve the surgical team’s coordination. For example, the system may identify a next instrument which will be required by the operator (based, for example, on a current stage of the surgical procedure, a voice instruction from the operator, or the like). At this stage, the system may provide guidance information (e.g. by highlighting the relevant instrument) such that the assistant can easily identify the instrument which should be passed to the operator.
  • Guidance provided during the surgical procedure may include an indication of an instrument which should be selected (from the surface 300) for provision to the operator at a certain stage and/or validation of a selection of an instrument which has been selected.
  • the system comprises an image capture device 100.
  • a single image capture device 100 is provided.
  • the present disclosure is not particularly limited in this respect.
  • a plurality of image capture devices can be provided.
  • each image capture device is configured to acquire image data from a particular viewpoint. This can improve the identification of the instruments, since identification can still be performed even if the view of the instruments from one of the plurality of image capture devices is obscured (such as if a person (or other object) moves between the image capture device and the surface 300 in the line-of-sight of the image capture device).
  • a single image capture device 100 can be provided. Accordingly, the present disclosure is not particularly limited to a situation whereby a plurality of image capture devices are provided.
  • Figure 3 of the present disclosure illustrates the configuration of an image capture device 100 in accordance with embodiments of the disclosure.
  • the image capture device 100 comprises an image acquisition unit 102, a processing unit 104 and a communication unit 106.
  • At least the image capture device 100 of the system can be mounted to a surgical instrument table (an example of the surface 300), such that at least a so-called bird’s-eye view image of the instruments can be acquired and image or video projections on the table surface can be realized.
  • the image capture device 100 can be mounted at any suitable location within the surgical environment from where it is able to acquire image data of the surface 300.
  • the image acquisition unit 102 may comprise the hardware for acquiring image data of the scene.
  • the image acquisition unit 102 may comprise a focusing element for focusing light from the scene and a detector for converting the light from the scene into image data.
  • the focusing element may be a reflective or refractive element.
  • the focusing element may comprise a lens, or lens system, for focusing the light from the scene onto the detector.
  • the detector may, in examples, be a type of image sensor such as a charge-coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS).
  • the detector may provide image data in any suitable format depending on the situation to which the embodiments of the present disclosure are applied.
  • the image acquisition unit may acquire high definition image data (such as 4K, 8K or higher resolutions).
  • the image data acquired by the image acquisition unit 102 is provided to the processing unit 104 of the image capture device.
  • the processing unit 104 may be a microprocessor carrying out computer instructions or may be an Application Specific Integrated Circuit (ASIC). Computer instructions may be stored on a storage medium (not shown), which may be a magnetically readable medium, optically readable medium or solid state type circuitry.
  • the storage medium may be integrated into the image capture device or may be separate to the image capture device and connected thereto using either a wired or wireless connection.
  • the computer instructions may be embodied as computer software that contains computer readable code which, when loaded onto the processing unit 104, configures the processing unit 104 to perform a method according to embodiments of the disclosure.
  • the processing unit 104 of the image capture device 100 may be configured to perform processing to analyse the image data from the image acquisition unit 102 to identify one or more instruments located in the scene.
  • the type of analysis performed by the processing unit 104 of the image capture device 100 to identify the one or more instruments located in the scene is not particularly limited.
  • the processing unit 104 of the image capture device 100 may be configured to perform object recognition processing on the image capture data as a type of image processing in order to identify the one or more instruments located in the scene.
  • the processing unit 104 of the image capture device 100 may employ the use of a trained model to identify the one or more instruments located in the scene.
  • the image capture device 100 is an example of a so-called edge AI camera, which can provide image analysis and video analytics at the edge of a network (thus maximizing network and bandwidth efficiencies).
  • the processing may be Artificial Intelligence (AI) processing utilizing a trained model.
  • all image data captured by the image capture device 100 is processed locally on the device. Furthermore, this data can be purged right after the metadata is inferred from the AI model. Accordingly, output data of the system (including the metadata transferred to the server 200) is not personally identifiable such that the system can comply with all data restrictions (including, for example, compliance with requirements of GDPR).
  • the image capture device 100 uses object detection and recognition and can infer the current set of instruments on the table as well as the current instrument held by the human preparation staff standing at the instrument table.
  • Object recognition describes the computational processes used to identify objects in digital images.
  • Object recognition may refer to image classification, object localisation or object detection.
  • Image classification refers to sorting images into classes.
  • Object localisation refers to locating objects within an image and demarcating the object in the image with, for example, a bounding box.
  • Object detection refers to locating objects within an image, demarcating the object in the image with, for example, a bounding box, and sorting the objects in the image into classes.
  • Object recognition is recognised as an important problem in the field of computer vision and has a wide variety of applications in areas such as surveillance and security, autonomous driving, or other applications which employ object tracking or segmentation.
  • Machine learning models may use supervised learning, unsupervised learning and/or reinforcement learning.
  • the processing unit 104 of the image capture device 100 may utilize machine learning models, as an example of a trained model, in the analysis of the image data acquired by the image acquisition unit 102.
  • a supervised learning model is trained using labelled training data to learn a function that maps inputs (typically provided as feature vectors) to outputs (i.e. labels).
  • the labelled training data comprises pairs of inputs and corresponding output labels.
  • the output labels are typically provided by an operator to indicate the desired output for each input.
  • the supervised learning model processes the training data to produce a function that can be used to map new (i.e. unseen) inputs to a label.
  • the input data may comprise various types of data, such as numerical values, images, video, text, or audio.
  • Raw input data may be pre-processed to obtain an appropriate feature vector used as input to the model - for example, features of an image (such as edges) may be extracted to obtain a corresponding feature vector.
  • the type of input data and techniques for pre-processing of the data may be selected based on the specific task the supervised learning model is used for.
  • the labelled training data set is used to train the supervised learning model.
  • the model adjusts its internal parameters (e.g. weights) so as to optimize (e.g. minimize) an error function, aiming to minimize the discrepancy between the model’s predicted outputs and the labels provided as part of the training data.
  • the error function may include a regularization penalty to reduce overfitting of the model to the training data set.
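  • Schematically, and as a generic formulation rather than one prescribed by the disclosure, training therefore minimizes an objective of the form

    $$\min_{\theta}\;\frac{1}{N}\sum_{i=1}^{N}\ell\big(f_{\theta}(x_i),\,y_i\big)\;+\;\lambda\,R(\theta)$$

    where $f_{\theta}$ is the model with internal parameters $\theta$, $\ell$ measures the discrepancy between the model's prediction for input $x_i$ and its label $y_i$, and $\lambda\,R(\theta)$ is the regularization penalty that reduces overfitting.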
  • the supervised learning model may use one or more machine learning algorithms in order to learn a function which provides a mapping between its inputs and outputs.
  • Example suitable learning algorithms include linear regression, logistic regression, artificial neural networks, decision trees, support vector machines (SVM), random forests, and the K-nearest neighbour algorithm.
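  • By way of a generic illustration of supervised training and inference (using scikit-learn with synthetic data; this is not an implementation mandated by the disclosure):

```python
# Generic supervised-learning sketch; the feature vectors and labels are
# synthetic stand-ins for real labelled training data.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic labelled data: inputs (feature vectors) and output labels.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X_train, y_train)              # learn the input -> label mapping
print("held-out accuracy:", model.score(X_test, y_test))
```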
  • the training data used to train the trained model may include images of different surgical instruments utilized during a range of surgical procedures.
  • the images of the different surgical instruments may include images of the surgical instruments at different positions (location and orientation) within the surgical environment.
  • the images of the different surgical instruments may include images of the surgical instruments from different suppliers and manufacturers.
  • the images of the different surgical instruments may include images with different image capture conditions (different image brightness and contrast, for example).
  • the images of the different surgical instruments may also include images of the surgical instruments at different levels of occlusion (where an object at least partially lies in the line-of-sight between the image capture device and the surgical instrument).
  • the images of the different surgical instruments may also include images of the surgical instruments as they are being held or operated by a person.
  • the images used as part of the training data to train the trained model may further include synthetic images which have been generated or adapted in order to further increase the range of training data which is provided.
  • the supervised learning model may be used for inference - i.e. for predicting outputs for previously unseen input data.
  • the supervised learning model may perform classification and/or regression tasks.
  • in a classification task, the supervised learning model predicts discrete class labels for input data, and/or assigns the input data into predetermined categories.
  • in a regression task, the supervised learning model predicts labels that are continuous values.
  • Unsupervised learning models differ from supervised learning models in that the training data is not labelled. Unsupervised models are therefore suited to discovering new patterns and relations in raw unlabelled data whereas supervised learning is more suited to learning relationships between input data and output labels.
  • An autoencoder is an example of an unsupervised machine learning model.
  • in reinforcement learning, an agent interacts with an environment by performing actions and learns from the results of its actions based on feedback, thereby enabling the agent to progressively improve its decision making.
  • the reinforcement learning algorithm may rely on a model of the environment (e.g. based on Markov Decision Processes (MDPs)) or be model-free.
  • Example suitable model-free reinforcement learning algorithms include Q-learning, SARSA (State-Action-Reward- State-Action), Deep Q-Networks (DQNs), or Deep Deterministic Policy Gradient (DDPG).
  • the processing unit 104 of the image capture device 100 may also utilize deep learning during analysis of the image data acquired by the image acquisition unit 102.
  • Deep learning is a specialised type of machine learning. Deep learning algorithms have had particular success in performing object recognition tasks. In particular, deep learning algorithms have been developed which are considered to be robust to occlusion, and able to cope with complicated scenes and difficult illumination. Deep learning models are based on artificial neural networks with representation learning. Representation learning allows a model to discover representations needed for feature detection or classification from raw data. This is in contrast to other machine learning models where, for example, features are manually extracted from images and supplied as input data to the model. Accordingly, deep learning models can use images as the input and do not require manual extraction of features from the image. An example of using a supervised deep learning model to perform object recognition is discussed below:
  • a supervised deep learning model may be trained based on training data such as a set of images labelled with known objects. For example, input data (i.e. an image containing an object) and output labels (i.e. an identity and location of the object in the image) are provided to the deep learning model.
  • the deep learning model processes the training data to learn a function which maps the input data to the output labels. Then, when new input data (such as a new image containing an unknown object) is provided to the deep learning model, the deep learning model uses the function to provide output labels (such as a location and identity of the unknown object in the new image) based on the new input data.
  • deep learning models may additionally, or alternatively, use unsupervised learning and/or reinforcement learning.
  • Deep learning models in the field of object recognition include the Region-Based Convolutional Neural Network (R-CNN) family of models, and the You Only Look Once (YOLO) family of models.
  • the R-CNN family of models includes the R-CNN model, the Fast R-CNN model and the Faster R-CNN model.
  • the YOLO family of models contains eight different YOLO versions, the most recent version being YOLO v.8. Although the R-CNN family of models may generally be more accurate than the YOLO family of models, the YOLO family of models are much faster and therefore allow object recognition in real time.
  • the AI models used for object recognition may use the YOLOv8 model.
  • the AI models used for object recognition can include a latest YOLO model.
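  • A minimal inference sketch along these lines is shown below; it assumes the ultralytics Python package and a detection model fine-tuned on images of surgical instruments ("instruments.pt" and "instrument_tray.jpg" are placeholder file names).

```python
# Sketch of YOLO-based instrument detection using the `ultralytics`
# package; the weight and image file names are placeholders.
from ultralytics import YOLO

model = YOLO("instruments.pt")            # fine-tuned detection weights
results = model("instrument_tray.jpg")    # run detection on one image

for result in results:
    for box in result.boxes:
        name = result.names[int(box.cls)]      # predicted instrument class
        conf = float(box.conf)                 # detection confidence
        x1, y1, x2, y2 = box.xyxy[0].tolist()  # bounding box corners
        print(f"{name} ({conf:.2f}) at [{x1:.0f}, {y1:.0f}, {x2:.0f}, {y2:.0f}]")
```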
  • the processing unit 104 of the image capture device is configured to analyse the image data to identify one or more instruments located in the scene (such as the instrument, or instruments, provided on the surface 300 as described with reference to Figure 3 of the present disclosure).
  • the processing unit 104 can identify the type of instrument and/or the position of the instrument (location and orientation).
  • the processing unit 104 can identify the number of instruments which are present in the surgical environment.
  • Once the data has been analysed by the processing unit 104, the processing unit 104 generates associated metadata information (including information such as the type and position of the instruments which have been identified in the scene). This metadata information is then passed to the communication unit 106 of the image capture device for transmittance to the server 200. At this stage, the image data acquired by the image acquisition unit 102 can be purged. Accordingly, storage and network overheads and resources required by the system can be reduced, since the image data is neither transferred nor stored locally on the image capture device.
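  • Schematically, the on-device flow amounts to the sketch below; `camera`, `model` and `link` stand in for the image acquisition unit 102, processing unit 104 and communication unit 106, and all names are hypothetical placeholders.

```python
# Hypothetical per-frame flow on the image capture device: infer the
# metadata, transmit it, then purge the frame so that image data is
# neither stored nor transmitted.
def process_one_frame(camera, model, link):
    frame = camera.capture_frame()                # image acquisition unit 102
    metadata = model.detect_instruments(frame)    # processing unit 104
    link.send_metadata(metadata)                  # communication unit 106
    del frame                                     # purge the image data
```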
  • the metadata generated by the image capture device for transmission to the server is anonymised data. That is, it does not contain any sensitive information, such as information which could be used to identify a patient or assistant within the surgical environment.
  • where the metadata includes information such as details of a pre-existing medical condition of the patient, the data is anonymised prior to its inclusion within the metadata information, such that this information cannot be linked with the patient. Therefore, the metadata which has been generated by the image capture device can be transmitted externally to the surgical environment (for processing by the server) while complying with all relevant data usage restrictions (such as the restrictions imposed by GDPR).
  • the communication unit 106 of the image capture device can utilize any suitable wired or wireless communication mechanism to transfer the metadata to the server 200.
  • the communication unit 106 may be configured to communicate with the server over a network such as the internet.
  • the system comprises a server 200.
  • Figure 3 of the present disclosure illustrates an example configuration of the server 200 in accordance with embodiments of the disclosure.
  • the server comprises a storage unit 202, a processing unit 204 and a communication unit 206.
  • the storage unit 202 is shown integral to the server 200. However, the storage unit 202 may be integrated into the server or may be separate to the server 200 and connected thereto using either a wired or wireless connection.
  • the way in which the processing unit 204 of the server 200 is configured to generate the guidance information and perform control of the display device for display of the guidance information to the assistant is not particularly limited in accordance with embodiments of the disclosure. Indeed, the type of processing performed and/or the manner of provision of this guidance information may vary depending on the situation to which the embodiments of the disclosure are applied.
  • generation of the guidance information comprises, firstly, a comparison of the metadata information received from the image capture devices with a standard protocol and, secondly, determination of the guidance to provide to the assistant on the basis of this comparison.
  • the storage unit 202 may store information concerning different protocols for different surgical operations.
  • a protocol may define the different surgical instruments which are required for a specific surgical operation.
  • a protocol may define the different instruments which are required at different stages during the surgical operation.
  • the protocol may define an optimal location (or indeed, a range of locations) at which the instrument should be located on a surface (such as the surgical tray) in order that the instruments can be more easily accessed when required during the surgical procedure.
  • a first instrument may often be required to be used at a similar time as a second instrument. Accordingly, it may be that these two instruments should be located in close proximity to each other on the surface.
  • the protocol may define that the instruments should be located on the surface in a particular order or arrangement (such as in the order in which they are required to be used during the surgical procedure).
  • the protocols may be defined in advance based on standard medical practice.
  • the protocols may be defined based upon the experience of medical professionals who have conducted previous surgical procedures.
  • a protocol, once stored in the storage unit, may be updated on a regular basis so that the protocol remains up-to-date with developments in standard medical practice and procedure. Accordingly, the protocol can be considered to provide an appropriate guide and/or checklist against which the instruments which have been detected can be compared.
  • the metadata received from the image capture device may indicate a particular surgical procedure or operation in addition to the information concerning the instruments which have been identified.
  • the surgical procedure or operation which has been indicated in the metadata may then be used by the server in order to retrieve, from the storage unit, the relevant protocol for the current surgical procedure.
  • the server may be configured to identify the surgical procedure in accordance with information which has been provided (such as a surgical timetable for the surgical environment or the like).
  • comparison between the standard protocol and the identified instruments can be performed by the processing unit 204 of the server 200 through a direct comparison. For example, if instruments A, B and C have been identified, yet the standard protocol indicates that the instruments A, B, C and D are required for the surgical procedure, the processing unit 204 of the server 200 may identify that the instrument D is currently missing from those instruments which have been identified. Accordingly, the server may generate guidance information indicating that the surgical instrument D is missing and should be included in order to provide a complete set of surgical instruments for the surgical procedure.
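  • In the simplest case this comparison reduces to a set difference, as in the following sketch (the instrument names mirror the A/B/C/D example above):

```python
# Direct comparison of identified instruments against a standard protocol.
identified = {"A", "B", "C"}              # from the metadata received
protocol_required = {"A", "B", "C", "D"}  # from the standard protocol

missing = protocol_required - identified  # -> {"D"}
unexpected = identified - protocol_required

if missing:
    print("guidance: add missing instrument(s):", sorted(missing))
if unexpected:
    print("guidance: remove instrument(s) not in the protocol:", sorted(unexpected))
```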
  • the processing unit 204 may be configured in order to use a trained model (such as that described with reference to the image capture device 100) in order to generate the guidance information.
  • the trained model may be a machine learning model.
  • the trained model may be trained using supervised learning, unsupervised learning and/or reinforcement learning, for example.
  • the trained model may be trained on training data which includes a list of instruments used for certain surgical procedures (or certain stages of the surgical procedures).
  • the training data may also include the placement of those instruments on the surface.
  • the training data may include data from past surgical events for example.
  • the training data may also include data such as standard medical protocol information, indicating the types of instruments which should be located and the location of the instruments on the surface.
  • the model may be trained on this training data such that for any input surgical procedure or process, it can generate a list (as output) of the instruments which are required for that surgical procedure or process and, furthermore, the optimal location of the surgical instruments on the surface for that surgical procedure or process.
  • the training data used to train the trained model may also include additional information, such as the type of operator performing the surgical procedure and/or any linked or related medical conditions (such as pre-existing conditions of patients). This information can be used to train the trained model in order that it can identify differences between the instruments which should be used for a same surgical procedure for different operators and/or for different pre-existing medical conditions of the patient.
  • the processing unit 204 of the server 200 may also be configured to generate a different type of guidance information depending on the type of display unit which is available for display of the guidance information to the assistant (or other person) within the surgical environment. Different types of guidance information, and the manner of provision of this guidance information to the assistant (or other person) within the surgical environment, are described in more detail with reference to Figure 7 of the present disclosure.
  • the processing unit may perform control of a display device to display the guidance to a user, the guidance comprising an indication of a surgical instrument.
  • the server may use the communication unit 206 in order to communicate with and perform control of the display device.
  • Figure 4 of the present disclosure illustrates an example configuration of imaging devices in accordance with embodiments of the disclosure.
  • Figure 4 provides an example of how different image capture devices can be arranged in a surgical environment when a plurality of image capture devices are provided as part of the system. That is, it will be appreciated that in examples a single image capture device 100A may be provided as part of the system. However, in other examples, a plurality of image capture devices may be provided. Use of a plurality of image capture devices can further improve the accuracy of instrument identification, since each of the plurality of image capture devices can be arranged in order to provide a unique viewpoint of the instrument surface. This makes it less likely that the view of the instrument tray will become obstructed.
  • each of the four additional image capture devices provides a unique view of the surface 300 on which the assistant 302 arranges the instruments ahead of the surgical procedure.
  • each of the four additional image capture devices 100B is a secondary image capture device, while the image capture device 100A is the primary (or master) image capture device.
  • the image capture devices (including the primary image capture device and any secondary image capture devices) can communicate using any suitable wired or wireless communication.
  • the image capture devices may communicate locally with each other using wireless communication.
  • the image capture devices may communicate locally with each other using a wireless communication standard such as 5G wireless communication.
  • a wireless communication technology such as Bluetooth can be used for close-range data transmission between the respective image capture devices.
  • each of the secondary image capture devices is configured to acquire image data of the scene (from its unique viewpoint). Then, instead of processing the image data to generate the metadata, the secondary image capture device transmits the image data via a local communication mechanism to the master image capture device. The master image capture device is then responsible for processing the image data that it has acquired and the image data received from the other image capture devices (the secondary image capture devices) in order to identify the instruments located on the surface 300.
  • the master image capture device may collate the image data which has been acquired and provide this as input to the trained model for identifying of the instruments located on the surface.
  • the trained model (or other means of instrument identification) has access to the image data from across all the different image capture devices at the time at which the instrument identification is performed. This is advantageous, as increased input image data to the model will result in improved reliability and accuracy of the instrument identification (and reduces the likelihood of conflicts which may arise if each image capture device individually and independently applied a trained model to its own image data).
  • the image data can be purged and only the metadata information can be transmitted to the server. Accordingly, the system still achieves compliance with data restrictions such as GDPR, since the image data is only transmitted locally within the surgical environment (between image capture devices) and is not transmitted externally to the server.
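  • A sketch of that collation step is given below; `detect` stands in for the trained model, the frames are assumed to arrive only over the local (intra-theatre) network, and the merge heuristic is an invented placeholder.

```python
# Hypothetical collation on the primary (master) image capture device:
# frames from all viewpoints are analysed together and then purged, so
# that only metadata ever leaves the surgical environment.
def identify_instruments(own_frame, frames_from_secondaries, detect):
    all_frames = [own_frame] + list(frames_from_secondaries)
    detections = []
    for frame in all_frames:                 # one viewpoint per device
        detections.extend(detect(frame))     # (type, x_mm, y_mm, confidence)
    merged = merge_detections(detections)
    del all_frames                           # purge image data after inference
    return merged                            # metadata sent on to the server

def merge_detections(detections):
    # Naive placeholder merge: keep the highest-confidence detection
    # per instrument type across all viewpoints.
    best = {}
    for det in detections:
        name, _x, _y, conf = det
        if name not in best or conf > best[name][3]:
            best[name] = det
    return list(best.values())
```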
  • while Figure 4 has been described with reference to an example with one primary image capture device and four secondary image capture devices, it will be appreciated that the present disclosure is not particularly limited in this regard.
  • the number of secondary image capture devices may be much higher than the number of secondary image capture devices described with reference to Figure 4 of the present disclosure. Alternatively, the number of secondary image capture devices may be less than the number of secondary image capture devices shown in Figure 4 of the present disclosure.
  • the location of the image capture devices is not limited to that shown in the example of Figure 4 of the present disclosure. More generally, the image capture devices may be arranged at any given location within the surgical environment, provided that the image capture device is able to capture an image of the surface 300 upon which the instruments are arranged.
  • an assistant 5000 is arranging surgical instruments for an upcoming surgical procedure on a surface.
  • the surface is a surgical tray 5002 on a table in the surgical environment (the surgical theatre).
  • a plurality of image capture devices 100 are provided. This includes a main image capture device (the main edge AI camera in this example) and at least one secondary image capture device (the sub edge AI camera in this example). Each of the main and secondary image capture devices acquires image data of the instruments on the surgical tray.
  • the image capture devices may be mounted to the surgical tray (or the table on which the surgical tray is placed) in order that an improved view of the surgical tray can be provided.
  • the image capture devices may be mounted at any location within the surgical environment, provided that the image capture devices can acquire an image of the surgical tray. This may include mounting the image capture devices on the floor, wall or ceiling within the surgical environment.
  • the detection accuracy of the system can be further enhanced by using additional image capture devices with additional viewpoints on the surgical tray, enabling a better differentiation of surgical instruments from different viewpoints (e.g. prevent occlusion, reduce detection ambiguity).
  • the image capture devices may be arranged at certain locations within the surgical environment during an initial calibration phase of the system.
  • when the secondary image capture device acquires image data, it provides that image data over a local network to the main image capture device. Since transfer of the image data is made only over the local network (such as an intranet), full compliance with data restrictions (such as GDPR) governing the use of sensitive data can be observed.
  • the main image capture device can analyze the image data (using a vision-based edge AI trained model, for example) to generate metadata indicating at least one of the type and position (including location and/or orientation) of the instruments which have been identified. That is, the main image capture device can use object detection and recognition to infer the current set of instruments on the table as well as the current instrument held by the human preparation staff standing at the instrument table.
  • This information can then be transmitted to a server for additional processing and analysis.
  • the server can use the metadata to compare the instruments which have been identified with a target instrument set (based, for example, on a standard protocol). Guidance information can then be generated on the basis of this comparison for provision to the assistant.
  • the display devices are devices capable of providing audio and/or visual information to the assistant
  • a first display device 5008A is an image projector.
  • the image projector 5008A is located within the surgical environment (the surgical theatre) and configured such that it can project light (such as an image) onto the surgical tray 5002.
  • the image projector 5008A may be mounted on a ceiling above the surgical tray 5008A. Accordingly, the image projector 5008A can be used to project information on the surgical tray (or other surface, such as an instrument table) to provide guidance to the assistant
  • the nature of the information provided to the assistant may vary depending on the situation to which the embodiments of the disclosure are applied.
  • The projector 5008A may be controlled in order to highlight one or more surgical instruments in the scene.
  • The one or more surgical instruments may include an instrument which should be used next during the surgical procedure (and will need to be passed to a surgeon).
  • The one or more surgical instruments may include an instrument which has been incorrectly placed on the surgical tray and should therefore be removed (such as an instrument which is not required during the surgical procedure).
  • The one or more surgical instruments may include a specific instrument from amongst the instruments on the surgical tray which has been requested by the surgeon (with the request of the surgeon being identified in accordance with voice recognition processing, for example).
  • The projector 5008A may be controlled to project images of surgical instruments in the scene.
  • The images of the surgical instruments in the scene may include images of a set of surgical instruments which are required for the surgical procedure (based on a standard protocol).
  • The images of the surgical instruments may also include images of the surgical instruments already present on the surgical tray, but at a different location to the actual location of those instruments.
  • The image projector 5008A may be used to guide the assistant to position the instruments at an optimal location on the surgical tray; that is, visual support indicating which instrument should be placed, and where, can be provided to the assistant.
  • The images of the surgical instruments may also be used in order to provide a visual representation to the assistant of any missing instruments.
  • The system can be used in order to validate the surgical instruments which have been provided in the scene or selected by the assistant.
  • The system can be used to provide a recommendation concerning a set of instruments which should be used and/or suggest any extension to the set of surgical instruments which have been provided.
  • The system can be used in order to highlight a required surgical instrument and/or indicate a surgical instrument which will be required in the next stage of the surgical procedure.
  • A second display device, which may also be provided either alternatively or in addition, is an audio generation device 5008B.
  • The audio generation device 5008B may be a device such as a speaker or the like.
  • The audio generation device 5008B may be used in order to provide audio instructions to the assistant as guidance information.
  • The audio instructions provided to the assistant may vary depending on the situation to which the embodiments of the disclosure are applied.
  • The audio instructions may vary depending on whether the audio generation device is used in combination with a visual display device (such as the image projector 5008A) or whether the audio generation device is used as an independent display device.
  • The audio instructions may also vary depending on whether the system is used for guidance during preparation of the surgical instruments (ahead of a surgical procedure) or whether the system is used for guidance in provision of the surgical instruments (during the surgical procedure).
  • In this example, the system is used during preparation of the surgical instruments (ahead of a surgical procedure). Therefore, the guidance provided by the audio generation device 5008B includes guidance information for the assistant when preparing the surgical instruments on the surface (here, the surgical tray on the table). Moreover, in this specific example, the audio generation device 5008B is used in combination with a visual display device (such as the projector 5008A or a display screen 5008C). Accordingly, the audio generation device provides the instruction, "Please place the selected instrument onto the marked area in tray". The assistant is therefore guided to place the instrument they are currently holding at the correct location on the surgical tray (with the marked area in the tray being indicated by a display device such as the projector 5008A). In this way, the system can be used to provide guidance to the assistant such that a more reliable preparation and selection of surgical instruments can be achieved. Thus, the safety and efficiency of surgical procedures can be improved, further improving patient outcomes.
  • A third display device, which may also be provided either alternatively or in addition, is the display screen 5008C.
  • The display screen is a type of display device (such as a computer monitor, television screen or the like) which is able to display visual information that can be seen by a person viewing the screen.
  • The display screen may also be able to provide audio information; that is, the display screen may contain an audio generation device such as the audio generation device 5008B.
  • While the projector displays visual information to the assistant by projecting light or images onto a surface (such as the tray), the display screen displays visual information on the screen itself, such that a person looking at the screen is presented with the desired visual information.
  • The nature of the information which is displayed to the person (such as the assistant) on the display screen 5008C may vary depending upon the situation to which the embodiments of the disclosure are applied.
  • The information may vary depending upon whether or not the display screen 5008C is the only display device which is present.
  • The information may also vary depending on whether the system is being used to provide guidance during the preparation of instruments (ahead of the surgical procedure) or to guide the assistant in providing the surgeon with a desired instrument (during the surgical procedure).
  • Figure 6 illustrates an example provision of guidance in accordance with embodiments of the disclosure.
  • A detailed illustration of an image as may be provided on the display screen 5008C is shown.
  • The system is being used to provide guidance during the surgical procedure. Therefore, the guidance information is information which will assist in the accurate and reliable selection of the instrument which should be provided to the surgeon at a given stage of the surgical procedure.
  • An instrument which will be required by the surgeon in a next stage of the surgical procedure is also highlighted. In this example, this is shown by the text "Upcoming Instrument: 5" on the display screen.
  • The manner by which the upcoming instrument is highlighted is not particularly limited in accordance with embodiments of the disclosure.
  • The display screen can be used to present information to the human staff.
  • The information presented will vary depending on the situation to which the embodiments of the disclosure are applied.
  • A target instrument set can be presented on the display screen (e.g. a set determined in accordance with the standard protocol).
  • A current instrument which is required can be highlighted (as described above).
  • A checklist may be presented to the operation technical assistant, showing e.g. the standard protocol against the actual detection results, in order to quickly identify missing instruments in the prepared set.
  • A set of instruments that differs from the standard protocol, depending on the surgeon's preferences (e.g. derived from previous similar operations), can be presented.
  • The current set of instruments (as present on the instrument table) may be displayed. Then, a requested instrument (based on an instruction from a surgeon) and a next instrument according to the standard protocol may be displayed.
  • Figure 7 illustrates an example method in accordance with embodiments of the present disclosure.
  • The example method described with reference to Figure 7 of the present disclosure may be performed by a system such as that described with reference to Figure 2 of the present disclosure (i.e. a system including an image capture device 100 and a server 200).
  • A computer program product may be provided which, when implemented by a computer (or by the components of the system described with reference to Figure 2 of the present disclosure), causes the computer (or those components) to perform the method of embodiments of the present disclosure.
  • The computer program may be stored on a non-transitory computer readable storage medium.
  • In step S702, the method comprises acquiring image data of a scene using an image capture device.
  • In step S704, the method comprises analysing, at the image capture device, the image data to identify one or more instruments located in the scene.
  • In step S706, the method comprises transmitting metadata, generated in accordance with the identified instruments, to a server.
  • In step S708, the method comprises processing the metadata, at the server, to generate guidance in accordance with the surgical instruments which have been identified and a standard protocol.
  • In step S710, the method comprises controlling a display device to display the guidance to a user, the guidance comprising an indication of a surgical instrument (a minimal sketch of steps S702 to S710 is given in the example after this list).
  • Embodiments of the present disclosure are not particularly limited to the number and arrangement of method steps which have been illustrated with reference to the specific example of Figure 7 of the present disclosure. Any number of additional steps may be provided. Alternatively, or in addition, at least one or more of the steps of the method of Figure 7 may be performed in sequence and/or in parallel with the other steps of the method. Hence, the present disclosure is not particularly limited in this regard.
  • The standard protocol is related to a surgical procedure to be performed, and the server is configured to control a display device to display the guidance to the user, the guidance comprising an indication of the surgical instrument for the surgical procedure.
  • The indication of the surgical instrument for the surgical procedure comprises a highlight of an instrument for use in a current stage of the surgical procedure.
  • The guidance comprises at least one of: guidance information for validation of a surgical instrument for the surgical procedure, guidance information for extension of the surgical instruments for the surgical procedure, and guidance information for recommendation of a surgical instrument for the surgical procedure.
  • Extension of the surgical instruments includes an indication of one or more additional surgical instruments which should be provided.
  • The image capture device is configured to purge the image data which has been acquired once the metadata has been transmitted to the server.
  • The system comprises a plurality of image capture devices, each image capture device being configured to acquire image data from a particular viewpoint.
  • Each of the plurality of image capture devices is configured to acquire image data of the scene and transmit the image data to a first image capture device; the first image capture device is configured to analyse the image data which has been acquired by the plurality of image capture devices to identify one or more instruments located in the scene and transmit metadata, generated in accordance with the identified instruments, to a server.
  • The server is further configured to control an audio device to provide the guidance to the user.
  • The guidance further comprises an indication of a target location of a surgical instrument in the surgical scene for the surgical procedure.
  • A method of providing guidance for selection of a surgical instrument comprises: acquiring (S702) image data of a scene using an image capture device (100); analysing, at the image capture device, the image data to identify one or more instruments located in the scene; transmitting metadata, generated in accordance with the identified instruments, to a server; processing the metadata, at the server, to generate guidance in accordance with the surgical instruments which have been identified and a standard protocol; and controlling a display device to display the guidance to a user, the guidance comprising an indication of a surgical instrument.
  • A computer program product comprising instructions which, when executed by a computer, cause the computer to perform a method according to clause 19.
  • Described embodiments may be implemented in any suitable form including hardware, software, firmware or any combination of these. Described embodiments may optionally be implemented at least partly as computer software running on one or more data processors and/or digital signal processors.
  • The elements and components of any embodiment may be physically, functionally and logically implemented in any suitable way. Indeed, the functionality may be implemented in a single unit, in a plurality of units or as part of other functional units. As such, the disclosed embodiments may be implemented in a single unit or may be physically and functionally distributed between different units, circuitry and/or processors.
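By way of illustration, the following minimal, self-contained Python sketch walks through steps S702 to S710 end to end. Every function and class name here is an assumption introduced for the example, and the on-device detection is simulated so that the sketch runs as-is; it is not a definitive implementation of the claimed system.

```python
# Minimal sketch of steps S702 to S710; all names are illustrative
# assumptions and the on-device detection is simulated.
from dataclasses import dataclass, asdict

@dataclass
class Detection:
    instrument: str          # identified instrument type
    location_xy: tuple       # position on the tray
    orientation_deg: float   # orientation on the tray

def acquire_image_data() -> bytes:
    # S702: acquire image data of the scene (simulated frame).
    return b"<raw frame>"

def analyse_on_device(frame: bytes) -> list:
    # S704: analyse the image data at the image capture device.
    # A real device would run an edge AI model here; output is simulated.
    return [Detection("tweezers", (0.2, 0.4), 90.0),
            Detection("retractor", (0.6, 0.4), 0.0)]

def transmit_metadata(detections: list) -> list:
    # S706: transmit only metadata (never the image data) to the server.
    return [asdict(d) for d in detections]

def generate_guidance(metadata: list, protocol: set) -> list:
    # S708: compare identified instruments against the standard protocol.
    found = {m["instrument"] for m in metadata}
    return [f"Missing instrument: {name}" for name in sorted(protocol - found)]

def display_guidance(messages: list) -> None:
    # S710: control a display device to present the guidance to the user.
    for message in messages:
        print(message)

frame = acquire_image_data()
metadata = transmit_metadata(analyse_on_device(frame))
display_guidance(generate_guidance(
    metadata, {"tweezers", "retractor", "energy treatment tool"}))
```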

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Biomedical Technology (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Surgery (AREA)
  • Software Systems (AREA)
  • Business, Economics & Management (AREA)
  • General Business, Economics & Management (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Databases & Information Systems (AREA)
  • Veterinary Medicine (AREA)
  • Robotics (AREA)
  • Pathology (AREA)
  • Animal Behavior & Ethology (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Human Computer Interaction (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Radiology & Medical Imaging (AREA)
  • Urology & Nephrology (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

A system for providing guidance for selection of a surgical instrument is provided, the system comprising: an image capture device configured to: acquire image data of a scene; analyse the image data to identify one or more instruments located in the scene; and transmit metadata, generated in accordance with the identified instruments, to a server; and the server, wherein the server is configured to: process the metadata to generate guidance in accordance with the surgical instruments which have been identified and a standard protocol; and control a display device to display the guidance to a user, the guidance comprising an indication of a surgical instrument.

Description

A SYSTEM FOR PROVIDING GUIDANCE FOR SELECTION OF A SURGICAL INSTRUMENT, A METHOD AND A COMPUTER PROGRAM PRODUCT
BACKGROUND:
Field of the Disclosure
The present disclosure relates to a system for providing guidance for selection of a surgical instrument, a method and a computer program product.
Description of the Related Art
The "background" description provided herein is for the purpose of generally presenting the context of the disclosure. Work of the presently named inventors, to the extent it is described in the background section, as well as aspects of the description which may not otherwise qualify as prior art at the time of filing, are neither expressly nor impliedly admitted as prior art against the present invention.
Surgical procedures are complex procedures which require use of a number of different surgical instruments. The type of surgical instrument which is required at any given time may vary depending on the type of the surgical procedure being performed and the stage of the procedure.
Reliable preparation and selection of surgical instruments for a surgical procedure is of significant importance in improving safety and efficiency of a surgical procedure.
However, given the complexity of surgical procedures, it can be difficult to prepare and select surgical instruments. Inaccurate or incomplete preparation and selection of surgical instruments can lead to missing instruments during operation and/or result in an incorrect instrument being passed to a surgeon during the operation.
In addition, restrictions on usage of data relating to the perioperative process can further exacerbate the problem of reliable preparation and selection of surgical instruments.
Therefore, a more reliable way of preparing and selecting surgical instruments is desired.
It is an aim of the present disclosure to address these issues.
SUMMARY:
Embodiments of the present disclosure are defined by the independent claims. Further aspects of the disclosure are defined by the dependent claims. In accordance with embodiments of the disclosure a more reliable way of preparing and selecting surgical instruments is provided, while data relating to the perioperative process is utilized in a manner complying with restrictions on usage. Through more reliable preparation and selection of surgical instruments, safety and efficiency of a surgical procedure can be improved. This may further improve patient outcomes from surgical procedures.
The present disclosure is not particularly limited to these advantageous technical effects. Further technical effects will become apparent to the skilled person when reading the disclosure.
BRIEF DESCRIPTION OF THE DRAWINGS:
A more complete appreciation of the disclosure and many of the attendant advantages thereof will be readily obtained as the same becomes better understood by reference to the following detailed description when considered in connection with the accompanying drawings, wherein:
Figure 1 illustrates an example of a schematic configuration of a surgical system to which embodiments of the disclosure can be applied;
Figure 2 illustrates an example system in accordance with embodiments of the disclosure;
Figure 3 illustrates an example configuration of components of an example system in accordance with embodiments of the disclosure;
Figure 4 illustrates an example configuration of imaging devices in accordance with embodiments of the disclosure;
Figure 5 illustrates an example implementation of a system in accordance with embodiments of the disclosure;
Figure 6 illustrates an example provision of guidance in accordance with embodiments of the disclosure;
Figure 7 illustrates a method in accordance with embodiments of the disclosure.
DESCRIPTION OF THE EMBODIMENTS:
The foregoing paragraphs have been provided by way of general introduction, and are not intended to limit the scope of the following claims. The described embodiments, together with further advantages, will be best understood by reference to the following detailed description taken in conjunction with the accompanying drawings (wherein like reference numerals designate identical or corresponding parts throughout the several views).
Turning now to Figure 1 of the present disclosure, a block diagram illustrating an example of a schematic configuration of a surgical system 5000 to which the technology according to the present disclosure can be applied is shown. Figure 1 illustrates a state where an operator (doctor) 5067 is conducting surgery on a patient 5071 on a patient bed 5069.
In this specific example, the surgery being performed on the patient 5071 is an endoscopic surgical procedure using the endoscopic surgery system 5000. However, the present disclosure is not particularly limited in this regard. More generally, embodiments of the present disclosure can be applied to any type of surgical procedure. For example, embodiments of the disclosure may be applied to any type of surgical procedure including, but not limited to, general surgery, dental surgery, appendectomy, cataract surgery, heart surgery, neurosurgery, or the like. Therefore, the present disclosure is not particularly limited to this example situation described with reference to Figure 1.
As illustrated, the endoscopic surgery system 5000 is constituted by an endoscope 5001, other surgical tools 5017, a support arm device 5027 supporting the endoscope 5001, and a cart 5037 on which various devices for endoscopic surgery are mounted.
In the endoscopic surgery, the abdominal wall is punctured with a plurality of tubular hole-opening instruments called trocars 5025a to 5025d instead of cutting the abdominal wall to open the abdomen. Then, a lens barrel 5003 of the endoscope 5001 and the other surgical tools 5017 are inserted into a body cavity of the patient 5071 through the trocars 5025a to 5025d. In the illustrated example, as the other surgical tools 5017, an insufflation tube 5019, an energy treatment tool 5021, and forceps 5023 are inserted into the body cavity of the patient 5071. Furthermore, the energy treatment tool 5021 is a treatment tool that performs incision and peeling of a tissue, sealing of a blood vessel, or the like using high-frequency current or ultrasonic vibration. However, the illustrated surgical tool 5017 is merely an example, and various surgical tools generally used in endoscopic surgery, for example, tweezers, a retractor, and the like may be used as the surgical tool 5017.
An image of an operation site in the body cavity of the patient 5071 captured by the endoscope 5001 is displayed on a display device 5041. The operator 5067 performs treatment, for example, to excise an affected site using the energy treatment tool 5021 or the forceps 5023 while viewing the image of the operation site displayed by the display device 5041 in real time. Note that the insufflation tube 5019, the energy treatment tool 5021, and the forceps 5023 are supported by the operator 5067, an assistant, or the like during surgery although not illustrated.
Embodiments of the present disclosure may be applied to, and used with, an endoscopic surgical system 5000 as described with reference to Figure 1 of the present disclosure. As noted in the Background, surgical procedures (such as that described with reference to Figure 1 of the present disclosure) require the use of a number of different surgical instruments. For example, during the endoscopic surgery described with reference to Figure 1, the operator (a doctor) may require use of an instrument such as tweezers, retractors, an energy treatment tool or the like.
The type of surgical instrument which is required at any given time may vary depending on the type of the surgical procedure being performed and the stage of the procedure. Therefore, the present disclosure is not particularly limited to any specific example of a surgical instrument. More generally, the surgical instrument is any device which may be required during a surgical procedure for performing a specific action and/or for obtaining a desired effect.
Reliable preparation and selection of surgical instruments for a surgical procedure is of significant importance in improving safety and efficiency of a surgical procedure. This can reduce instances of a missing instrument during the surgical procedure and reduce instances of an incorrect instrument being passed to the operator (doctor) during the surgical procedure.
However, given the complexity of surgical procedures, it can be difficult to prepare and select surgical instruments.
Therefore, a more reliable way of preparing and selecting surgical instruments is desired.
Accordingly, embodiments of the present disclosure provide a system for providing guidance for selection of a surgical instrument, a method and a computer program product.
<System>
Consider, now, Figure 2 of the present disclosure. Figure 2 illustrates an example system in accordance with embodiments of the disclosure.
At least a part of the system of Figure 2 may be provided within a surgical environment (a surgical theatre) in which an operator (a doctor) is performing (or will perform) a surgical procedure on a patient. For example, the system of Figure 2 may be provided in a surgical environment where a surgical system 5000 as described with reference to Figure 1 of the present disclosure will be used.
The system illustrated in Figure 2 of the present disclosure is a system for providing guidance for selection of a surgical instrument. The system for providing guidance for selection of a surgical instrument comprises an image capture device 100 and a server 200.
In examples, the system illustrated with reference to the example of Figure 2 of the present disclosure may include one or more additional elements alongside the image capture device 100 and the server 200. For example, the system may further include a display device for displaying guidance information generated by the system to the user. However, in other examples, such a display device may not necessarily be part of the system; in this case, the system may utilize a pre-existing display device within the surgical environment for displaying the guidance information which has been generated. In addition, while only a single image capture device is illustrated with reference to Figure 2 of the present disclosure, the system may comprise a plurality of image capture devices. This is described in more detail later.
According to embodiments of the disclosure, the image capture device 100 is configured to acquire image data of a scene; analyse the image data to identify one or more instruments located in the scene; and transmit metadata, generated in accordance with the identified instruments, to a server.
In the example of Figure 2, the image capture device 100 is provided within the surgical theatre. Therefore, the image capture device acquires image data within this surgical theatre. In particular, the image capture device 100 may be configured to acquire image data of a surface 300 (such as an instrument table, a surgical tray or the like) upon which surgical instruments for use during a surgical procedure are arranged. The surgical instruments may be arranged on the surface 300 ahead of the surgical procedure by an assistant. Then, during the surgical procedure, the assistant may pass a surgical instrument from the surface 300 to the operator (the doctor, such as a surgeon).
Hence, the image capture device 100 which is configured to acquire image data of the scene may be configured in order to acquire image data of the scene including image data of a surface on which surgical instruments are to be arranged. However, the image data of the scene may not be limited specifically only to the surface 300. For example, the image data may include image data of the entire surgical environment, with the surface 300 forming a part of this image data. In other examples, the image capture device may capture image data only of the surface 300.
While a single image capture device 100 is shown in the example of Figure 2 of the present disclosure, it will be appreciated that the present disclosure is not particularly limited in this regard. That is, as noted above, a plurality of image capture devices 100 may be provided as part of the system in accordance with embodiments of the disclosure. Nevertheless, this example is described with reference to the use of a single image capture device 100.
The image capture device 100 may be configured to acquire image data of any type or format. For example, the image capture device may be configured to acquire high resolution image data, such as image data of a 4K or 8K resolution. Furthermore, the image data may comprise still image data (individual image frames) or moving image data (image frames forming part of a video). However, the type, format and resolution of the image data acquired by the image capture device is not particularly limited in accordance with embodiments of the disclosure.
According to embodiments of the disclosure, the image data acquired by the image capture device is processed locally at the image capture device (or within the surgical theatre side of the system). That is, the image capture device according to embodiments of the disclosure is provided with circuitry such that the image capture device can perform processing directly on the image data which is acquired.
The processing performed on the image data which has been acquired includes analysis of the image data to identify one or more instruments located in the scene (e.g. one or more instruments located on the surface 300). For example, this analysis may identify that tweezers and retractors have been provided on the surface 300.
The analysis of the image data which has been acquired is used, locally by the image capture device, in order to generate metadata for transmittance to the server. As an example, the metadata may include an indication of the type of surgical instruments which have been identified. Alternatively or in addition, the metadata may include an indication of the location of surgical instruments which have been identified. The location of the surgical instruments can include a position and orientation of the surgical instruments on the surface 300.
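By way of illustration only (the disclosure does not fix a metadata format), the metadata might be serialised along the following lines, carrying instrument type, location and orientation but no image content; the field names and values are assumptions for the example.

```python
# Illustrative metadata encoding: instrument type plus position
# (location and orientation), with no image data included.
import json

metadata = {
    "frame_id": 1042,  # hypothetical frame identifier
    "instruments": [
        {"type": "tweezers",  "location_xy": [0.21, 0.38], "orientation_deg": 90.0},
        {"type": "retractor", "location_xy": [0.64, 0.41], "orientation_deg": 0.0},
    ],
}

payload = json.dumps(metadata)  # compact text payload for transmission
print(payload)
```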
Once the metadata has been generated by the image capture device, the metadata is transmitted to the server 200.
The server 200 may not necessarily be located in the same environment as the image capture device 100. That is, while the image capture device 100 is located within the surgical theatre, the server 200 may be located outside the surgical theatre (this may include any suitable external location of the server 200). However, in examples, the server which performs the processing to generate the guidance information may also be located within the surgical environment.
However, since the image capture device 100 is configured to process the acquired image data locally in order to generate metadata which is then sent to the server, image data which has been acquired by the image capture device is not transmitted outside the surgical environment (even if the server 200 is located outside the surgical environment). Rather, only metadata which has been generated based on the analysis of the image data which has been acquired is transmitted to the server. Transmittance of the metadata to the server may be performed using any suitable wired or wireless communication interface. For example, the data may be communicated over a network such as the internet. Alternatively, the data may be communicated over a local intranet.
This is advantageous, as there may be one or more restrictions on usage of data relating to the perioperative process. For example, the General Data Protection Regulation (GDPR) may prevent personally identifiable data from being captured and stored outside of the surgical environment (e.g. in cloud storage or cloud processing). Therefore, since the sensitive data from within the surgical environment is processed locally at the image capture device within the surgical environment, the output provided to the server (the metadata) is not personally identifiable, such that the system can be made compliant with GDPR and other restrictions on usage of data relating to the perioperative process.
Furthermore, because the image data acquired by the image capture device 100 is processed locally and does not need to be transmitted to the server, the network traffic and overheads consumed by the system can be significantly reduced. That is, the information transmitted to the server side comprises the metadata information generated on the basis of the image data which has been acquired, but does not include the image data which has been acquired. Moreover, once the metadata information has been generated and transmitted to the server, the image data acquired by the image capture device can be purged (thus negating any requirement for storage of the image data).
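As a concrete illustration of this acquire, analyse, transmit and purge cycle, the following minimal sketch shows one way the flow could be structured on the device; the helper names and the simulated detection output are assumptions, not part of the disclosure.

```python
# Hedged sketch of the acquire -> analyse -> transmit -> purge cycle.
def detect(frame: bytes) -> list:
    # Placeholder for local, on-device inference; returns metadata only.
    return [{"type": "forceps", "location_xy": [0.5, 0.5]}]

def send_to_server(metadata: list) -> None:
    # Only metadata leaves the device (e.g. over a local network).
    print("transmitting", len(metadata), "detections")

def process_frame(frame: bytes) -> None:
    metadata = detect(frame)   # local analysis; the frame never leaves
    send_to_server(metadata)   # transmit metadata only
    del frame                  # purge the frame: nothing is stored on-device

process_frame(b"<raw 4K frame>")
```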
As noted, the image data may include high resolution image data including 4K and 8K images of the surgical scene. Accordingly, the impact of the image data on the network traffic and overheads consumed by the system can be suppressed, since the image data is neither transmitted to the server side nor stored locally at the image capture device.
In some examples, the server 200 may be any suitable computer hardware or software which provides functionality which can be utilized for provision of guidance for selection of a surgical instrument. The server 200 is configured to: process the metadata to generate guidance in accordance with the surgical instruments which have been identified and a standard protocol; and control a display device to display the guidance to a user, the guidance comprising an indication of a surgical instrument.
For example, once the metadata has been processed to generate the guidance, the server may control a display device within the surgical theatre (such as a display screen, a display unit, a projector, a speaker or the like) in order to provide guidance (including both audio and/or visual guidance) to a user within the surgical theatre. The guidance may include an indication of a surgical instrument, such as a surgical instrument which should be selected and passed to the operator (doctor). Alternatively or in addition, the guidance may include an indication of a surgical instrument which is missing from the surgical instruments which have been identified and which is likely to be required in the surgical procedure. Alternatively or in addition, the guidance may include validation of a surgical instrument which has been identified.
In accordance with embodiments of the disclosure, the server may generate the guidance in accordance with both the surgical instruments which have been identified (from the metadata received from the image capture device) and a standard protocol. The standard protocol may be a protocol defining one or more surgical instruments which are required. For example, the standard protocol may be related to the surgical procedure. A different surgical procedure (or indeed, a different stage of the same surgical procedure) may require a different surgical instrument.
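The disclosure does not prescribe any particular data structure for a standard protocol. One plausible representation, sketched below with invented procedure stages and instrument names, is a simple mapping from procedure and stage to the required instruments.

```python
# Sketch of one possible standard-protocol representation; the
# procedures, stages and instrument names are invented for illustration.
STANDARD_PROTOCOLS = {
    "endoscopic": {
        "access":    ["trocar", "insufflation tube"],
        "treatment": ["endoscope", "forceps", "energy treatment tool"],
        "closure":   ["needle holder", "suture"],
    },
}

# A different procedure, or a different stage of the same procedure,
# yields a different required instrument set.
required = STANDARD_PROTOCOLS["endoscopic"]["treatment"]
print(required)
```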
In examples, the standard protocol may be selected in accordance with an identification and/or indication of the surgical procedure and/or a current stage of the surgical procedure. This will be described in more detail later.
In examples, the server 200 may utilize external or cloud-based computation resources 400 in order to generate the guidance, since the metadata which has been provided to the server is compliant with data restrictions. The server may communicate with the external or cloud-based computation resources 400 using any suitable wired or wireless means.
As explained, once the server has generated the guidance information, the server is configured to control a display device to provide the guidance information to the user. This enables more accurate and reliable preparation and selection of surgical instruments.
Consider an example whereby an assistant arranges a number of surgical instruments on the surface 300 ahead of a surgical procedure. In this example, the surgical procedure is an endoscopic surgical procedure (such as that described with reference to Figure 1 of the present disclosure). Accordingly, the assistant may arrange surgical instruments including tweezers and retractors.
The image capture device 100 may acquire image data of the surface 300 and generate metadata indicating that tweezers and retractors have been placed on the surface 300. This metadata may then be transmitted to the server.
The server 200 may then identify that a surgical instrument is missing from the set of surgical instruments that have been arranged on the surface 300. For example, the server 200 may identify that, based on the standard protocol information, an energy treatment tool should also be provided in addition to the tweezers and retractors. Guidance information may then be generated by the server and displayed to the assistant within the surgical theatre. The guidance information may instruct the assistant that an energy treatment tool should also be provided in addition to the tweezers and retractors which have already been arranged on the surface. For example, the guidance information may be provided to the assistant through control of a display device, by the server 200, such that an image of the missing energy treatment tool is displayed to the assistant, who can then immediately understand that the energy treatment tool should be provided in addition to the instruments which have already been prepared and placed on the surface 300.
The energy treatment tool may then be added to the surface by the assistant on the basis of the guidance information.
Accordingly, when - during the surgical procedure - the operator (a doctor) requests that the energy treatment tool is provided, the assistant will be able to provide the tool to the operator (since the energy treatment tool has been added to the surface 300 on the basis of the guidance information). Accordingly, inaccurate or incomplete preparation of surgical instruments can be avoided.
While this example has been described with reference to a situation of incomplete preparation of the surgical instruments ahead of the surgical procedure, it will be appreciated that the present disclosure is not particularly limited in this regard.
More generally, the system may be used in at least three different types of situations.
As a first example situation, the system may be used to validate a surgical instrument set prepared by an operation technical assistant against a standard protocol (defining the instruments generally required for a specific type of procedure). For example, the system may be used for cross-checking for completeness and precise arrangement of instruments on a surface (such as a table) by the assistant ahead of the surgical procedure.
As a second example situation, the system may be used to recommend a special or extended surgical instrument set. This may be based on the standard protocol (defining a standard surgical instrument set) and also on one or more auxiliary factors. The one or more auxiliary factors may be based on a pre-existing condition of the patient, a preference or requirement of the operator (doctor) who will be performing the operation, or the like. In this way, the system may be used for cross-checking for completeness and precise arrangement of instruments on a surface (such as a table) by the assistant ahead of the surgical procedure, and also for recommending a special or extended surgical instrument set (one or more additional instruments beyond the standard protocol).
As a third example situation, the system may be used during the procedure to make instrument passing/transfer more secure and to improve the surgical team's coordination. For example, the system may identify a next instrument which will be required by the operator (based, for example, on a current stage of the surgical procedure, a voice instruction from the operator, or the like). At this stage, the system may provide guidance information (e.g. by highlighting the relevant instrument) such that the assistant can easily identify the instrument which should be passed to the operator.
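For the second example situation, a recommendation of an extended set could be sketched as below; the rule, the auxiliary factor names and the instrument names are all assumptions introduced for illustration, not logic defined by the disclosure.

```python
# Sketch of recommending an extended instrument set from the standard
# protocol plus auxiliary factors; the rule and all names are assumptions.
def recommend_set(standard: set, auxiliary: dict) -> set:
    extended = set(standard)
    if auxiliary.get("patient_condition") == "adhesions_expected":
        extended.add("additional retractor")  # condition-driven extension
    extended.update(auxiliary.get("surgeon_preferences", []))
    return extended

print(recommend_set(
    {"tweezers", "retractor"},
    {"patient_condition": "adhesions_expected",
     "surgeon_preferences": ["curved scissors"]}))
```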
Accordingly, the system described with reference to Figure 2 of the present disclosure may be used in order to generate guidance ahead of the surgical procedure and/or during the surgical procedure. Guidance provided during the surgical procedure may include an indication of an instrument which should be selected (from the surface 300) for provision to the operator at a certain stage and/or validation of a selection of an instrument which has been selected.
In this way, inaccurate or incomplete selection of a surgical instrument can be avoided.
Accordingly, the system described with reference to the example of Figure 2 of the present disclosure provides a more reliable way of preparing and selecting surgical instruments (such as those instruments required for, or during, a surgical procedure). Indeed, the system is able to provide guidance and control during the preparation, selection and transfer of surgical instruments and thus may reduce incidents caused by human error in the perioperative cycle.
Further details of embodiments of the disclosure will now be provided.
<Image Capture Device>
As explained with reference to Figure 2 of the present disclosure, the system comprises an image capture device 100.
In the example of Figure 2 of the present disclosure, a single image capture device 100 is provided. However, the present disclosure is not particularly limited in this respect. In examples, a plurality of image capture devices can be provided, each image capture device being configured to acquire image data from a particular viewpoint. This can improve the identification of the instruments, since identification can still be performed even if the view of the instruments from one of the plurality of image capture devices is obscured (such as if a person, or other object, moves between the image capture device and the surface 300 in the line-of-sight of the image capture device).
However, in examples, a single image capture device 100 can be provided. Accordingly, the present disclosure is not particularly limited to a situation whereby a plurality of image capture devices are provided.
Figure 3 of the present disclosure illustrates the configuration of an image capture device 100 in accordance with embodiments of the disclosure.
The image capture device 100 comprises an image acquisition unit 102, a processing unit 104 and a communication unit 106.
In examples, at least the image capture device 100 of the system can be mounted to a surgical instrument table (an example of the surface 300), such that at least a so-called bird’s-eye view image of the instruments can be acquired and image or video projections on the table surface can be realized. However, more generally, the image capture device 100 can be mounted at any suitable location within the surgical environment from where it is able to acquire image data of the surface 300.
The image acquisition unit 102 may comprise the hardware for acquiring image data of the scene. For example, the image acquisition unit 102 may comprise a focusing element for focusing light from the scene and a detector for converting the light from the scene into image data. The focusing element may be a reflective or refractive element. For example, the focusing element may comprise a lens, or lens system, for focusing the light from the scene onto the detector. The detector may, in examples, be a type of image sensor such as a charge-coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS). The detector may provide image data in any suitable format depending on the situation to which the embodiments of the present disclosure are applied. In examples, the image acquisition unit may acquire high definition image data (such as 4K, 8K or higher resolutions).
The image data acquired by the image acquisition unit 102 is provided to the processing unit 104 of the image capture device. The processing unit 104 may be a microprocessor carrying out computer instructions or may be an Application Specific Integrated Circuit. Computer instructions may be stored on a storage medium (not shown) which may be a magnetically readable medium, optically readable medium or solid state type circuitry. The storage medium may be integrated into the image capture device or may be separate from the image capture device and connected thereto using either a wired or wireless connection. The computer instructions may be embodied as computer software that contains computer readable code which, when loaded onto the processing unit 104, configures the processing unit 104 to perform a method according to embodiments of the disclosure.
The processing unit 104 of the image capture device 100 may be configured to perform processing to analyse the image data from the image acquisition unit 102 to identify one or more instruments located in the scene. The type of analysis performed by the processing unit 104 of the image capture device 100 to identify the one or more instruments located in the scene is not particularly limited. In some examples, the processing unit 104 of the image capture device 100 may be configured to perform object recognition processing on the captured image data as a type of image processing in order to identify the one or more instruments located in the scene.
Furthermore, in some examples, the processing unit 104 of the image capture device 100 may employ the use of a trained model to identify the one or more instruments located in the scene.
Indeed, in examples, the image capture device 100 is an example of a so-called edge AI camera, which can provide image analysis and video analytics at the edge of a network (thus maximizing network and bandwidth efficiencies). In other words, the processing (such as Artificial Intelligence (AI) processing, utilizing a trained model) can be performed directly in the image capture device 100.
In accordance with embodiments of the disclosure, all image data captured by the image capture device 100 is processed locally on the device. Furthermore, this data can be purged right after the metadata is inferred from the AI model. Accordingly, output data of the system (including the metadata transferred to the server 200) is not personally identifiable, such that the system can comply with all data restrictions (including, for example, compliance with requirements of GDPR).
The image capture device 100 (of which an edge AI camera device is an example) uses object detection and recognition and can infer the current set of instruments on the table as well as the current instrument held by the human preparation staff standing at the instrument table.
Object recognition describes the computational processes used to identify objects in digital images. Object recognition may refer to image classification, object localisation or object detection. Image classification refers to sorting images into classes. Object localisation refers to locating objects within an image and demarcating the object in the image with, for example, a bounding box. Object detection refers to locating objects within an image, demarcating the object in the image with, for example, a bounding box, and sorting the objects in the image into classes.
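As a small illustration of the three tasks just described, the sketch below distinguishes them using a simple bounding-box record; the names and coordinate values are assumptions introduced for the example.

```python
# Illustrative sketch of classification, localisation and detection.
from dataclasses import dataclass

@dataclass
class BoundingBox:
    x: float       # top-left corner (normalised image coordinates)
    y: float
    width: float
    height: float

# Image classification: a single class label for the whole image.
image_label = "instrument_tray"

# Object localisation: where an object is, demarcated by a bounding box.
located = BoundingBox(x=0.20, y=0.30, width=0.10, height=0.25)

# Object detection: localisation plus a class label for each object.
detections = [(BoundingBox(0.20, 0.30, 0.10, 0.25), "forceps"),
              (BoundingBox(0.55, 0.40, 0.08, 0.20), "retractor")]
print(detections)
```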
Object recognition is recognised as an important problem in the field of computer vision and has a wide variety of applications in areas such as surveillance and security, autonomous driving, or other applications which employ object tracking or segmentation.
As AI continues to develop and improve, object recognition is becoming increasingly effective. One branch of AI which has had particular success in object recognition is "machine learning". Machine learning models may use supervised learning, unsupervised learning and/or reinforcement learning.
The processing unit 104 of the image capture device 100 may utilize machine learning models, as an example of a trained model, in the analysis of the image data acquired by the image acquisition unit 102.
A supervised learning model is trained using labelled training data to learn a function that maps inputs (typically provided as feature vectors) to outputs (i.e. labels). The labelled training data comprises pairs of inputs and corresponding output labels. The output labels are typically provided by an operator to indicate the desired output for each input. The supervised learning model processes the training data to produce a function that can be used to map new (i.e. unseen) inputs to a label. The input data (during training and/or inference) may comprise various types of data, such as numerical values, images, video, text, or audio. Raw input data may be pre-processed to obtain an appropriate feature vector used as input to the model - for example, features of an image (such as edges) may be extracted to obtain a corresponding feature vector.
It will be appreciated that the type of input data and techniques for pre-processing of the data (if required) may be selected based on the specific task the supervised learning model is used for. Once prepared, the labelled training data set is used to train the supervised learning model.
During training the model adjusts its internal parameters (e.g. weights) so as to optimize (e.g. minimize) an error function, aiming to minimize the discrepancy between the model’s predicted outputs and the labels provided as part of the training data. In some cases, the error function may include a regularization penalty to reduce overfitting of the model to the training data set. The supervised learning model may use one or more machine learning algorithms in order to learn a function which provides a mapping between its inputs and outputs. Example suitable learning algorithms include linear regression, logistic regression, artificial neural networks, decision trees, support vector machines (SVM), random forests, and the K-nearest neighbour algorithm.
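As a small, concrete illustration of this supervised workflow (labelled feature vectors in, predicted labels out), the sketch below uses scikit-learn's k-nearest-neighbour classifier; the library choice and the toy feature values are assumptions, since the disclosure names algorithm families but no implementation.

```python
# Toy supervised-learning sketch using scikit-learn; the feature
# vectors and labels below are invented for illustration.
from sklearn.neighbors import KNeighborsClassifier

X_train = [[12.0, 1.0], [11.5, 0.9], [3.0, 7.5], [2.8, 8.1]]  # feature vectors
y_train = ["forceps", "forceps", "retractor", "retractor"]    # output labels

model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)          # learn the input-to-label mapping

print(model.predict([[11.8, 1.1]]))  # inference on a previously unseen input
```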
In the present disclosure, the training data used to train the trained model may include images of different surgical instruments utilized during a range of surgical procedures. The images of the different surgical instruments may include images of the surgical instruments at different positions (location and orientation) within the surgical environment. The images of the different surgical instruments may include images of the surgical instruments from different suppliers and manufacturers. The images of the different surgical instruments may include images with different image capture conditions (different image brightness and contrast, for example). The images of the different surgical instruments may also include images of the surgical instruments at different levels of occlusion (where an object at least partially lies in the line-of-sight between the image capture device and the surgical instrument). The images of the different surgical instruments may also include images of the surgical instruments as they are being held or operated by a person.
The images used as part of the training data to train the trained model may further include synthetic images which have been generated or adapted in order to further increase the range of training data which is provided.
Once trained, the supervised learning model may be used for inference - i.e. for predicting outputs for previously unseen input data. The supervised learning model may perform classification and/or regression tasks. In a classification task, the supervised learning model predicts discrete class labels for input data, and/or assigns the input data into predetermined categories. In a regression task, the supervised learning model predicts labels that are continuous values.
Unsupervised learning models differ from supervised learning models in that the training data is not labelled. Unsupervised models are therefore suited to discovering new patterns and relations in raw unlabelled data whereas supervised learning is more suited to learning relationships between input data and output labels. An autoencoder is an example of an unsupervised machine learning model.
In reinforcement learning, an agent interacts with an environment by performing actions and learns from the results of its actions based on feedback, thereby enabling the agent to progressively improve its decision making. The reinforcement learning algorithm may rely on a model of the environment (e.g. based on Markov Decision Processes (MDPs)) or be model-free. Example suitable model-free reinforcement learning algorithms include Q-learning, SARSA (State-Action-Reward- State-Action), Deep Q-Networks (DQNs), or Deep Deterministic Policy Gradient (DDPG).
The processing unit 104 of the image capture device 100 may also utilize deep learning during analysis of the image data acquired by the image acquisition unit 102.
Deep learning is a specialised type of machine learning. Deep learning algorithms have had particular success in performing object recognition tasks. In particular, deep learning algorithms have been developed which are considered to be robust to occlusion, and able to cope with complicated scenes and difficult illumination. Deep learning models are based on artificial neural networks with representation learning. Representation learning allows a model to discover representations needed for feature detection or classification from raw data. This is in contrast to other machine learning models where, for example, features are manually extracted from images and supplied as input data to the model. Accordingly, deep learning models can use images as the input and do not require manual extraction of features from the image. An example of using a supervised deep learning model to perform object recognition is discussed below:
In order to perform an object recognition task, a supervised deep learning model may be trained based on training data such as a set of images labelled with known objects. For example, input data (i.e. an image containing an object) and output labels (i.e. an identity and location of the object in the image) are provided to the deep learning model.
The deep learning model processes the training data to learn a function which maps the input data to the output labels. Then, when new input data (such as a new image containing an unknown object) is provided to the deep learning model, the deep learning model uses the function to provide output labels (such as a location and identity of the unknown object in the new image) based on the new input data.
Although the above example discusses a supervised deep learning model, it will be appreciated that deep learning models may additionally, or alternatively, use unsupervised learning and/or reinforcement learning.
Deep learning models in the field of object recognition include the Region-Based Convolutional Neural Network (R-CNN) family of models, and the You Only Look Once (YOLO) family of models. The R-CNN family of models includes the R-CNN model, the Fast R-CNN model and the Faster R-CNN model. The YOLO family of models contains eight different YOLO versions, the most recent version being YOLOv8. Although the R-CNN family of models may generally be more accurate than the YOLO family of models, the YOLO family of models are much faster and therefore allow object recognition in real time. In embodiments, the AI models used for object recognition may use the YOLOv8 model. Furthermore, in embodiments, the AI models used for object recognition can include the latest YOLO model.
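For illustration, a single-frame inference pass with a YOLOv8 model might look like the following sketch using the ultralytics package; the package choice, the weight file name and the image path are assumptions, as the disclosure names only the model family.

```python
# Possible single-frame YOLOv8 inference via the ultralytics package;
# the weight file and image path are hypothetical.
from ultralytics import YOLO

model = YOLO("surgical_instruments_yolov8.pt")  # hypothetical custom weights
results = model("tray.jpg")                     # detect objects in one frame

for box in results[0].boxes:
    name = model.names[int(box.cls)]                 # detected instrument class
    print(name, float(box.conf), box.xyxy.tolist())  # confidence and bbox
```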
Accordingly, the processing unit 104 of the image capture device is configured to analyse the image data to identify one or more instruments located in the scene (such as the instrument, or instruments, provided on the surface 300 as described with reference to Figure 2 of the present disclosure). For example, the processing unit 104 can identify the type of instrument and/or the position of the instrument (location and orientation). In addition, the processing unit 104 can identify the number of instruments which are present in the surgical environment.
Once the data has been analysed by the processing unit 104, the processing unit 104 generates associated metadata information (including the information such as the type and position of the instruments which have been identified in the scene). This metadata information is then passed to the communication unit 106 of the image capture device for transmittance to the server 200. At this stage, the image data acquired by the image capture unit 102 can be purged. Accordingly, storage and network overheads and resources required by the system can be reduced, since the image data is neither transferred nor stored locally on the image capture device.
Furthermore, the metadata generated by the image capture device for transmission to the server is anonymised data. That is, it does not contain any sensitive information, such as information which could be used in order to identify a patient or assistant within the surgical environment. Furthermore, in examples where the metadata includes information such as details of a pre-existing medical condition of the patient, the data is anonymised prior to its inclusion within the metadata information, such that this information cannot be linked with the patient. Therefore, the metadata which has been generated by the image capture device can be transmitted externally to the surgical environment (for processing by the server) while complying with all relevant data usage restrictions (such as the restrictions imposed by GDPR).
The communication unit 106 of the image capture device can utilize any suitable wired or wireless communication mechanism to transfer the metadata to the server 200. For example, the communication unit 106 may be configured to communicate with the server over a network such as the internet.
<Server>
As explained with reference to Figure 2 of the present disclosure, the system comprises a server 200.
Figure 3 of the present disclosure illustrates an example configuration of the server 200 in accordance with embodiments of the disclosure.
The server comprises a storage unit 202, a processing unit 204 and a communication unit 206.
The storage unit 202 is shown integral to the server 200. However, the storage unit 202 may instead be separate from the server 200 and connected thereto using either a wired or wireless connection. The way in which the processing unit 204 of the server 200 is configured to generate the guidance information and perform control of the display device for display of the guidance information to the assistant is not particularly limited in accordance with embodiments of the disclosure. Indeed, the type of processing performed and/or the manner of provision of this guidance information may vary depending on the situation to which the embodiments of the disclosure are applied.
However, it will be appreciated that generation of the guidance information comprises, firstly, a comparison of the metadata information which has been received from the image capture devices with a standard protocol and, secondly, a determination of the guidance to provide to the assistant on the basis of this comparison.
The storage unit 202 may store information concerning different protocols for different surgical operations. A protocol may define the different surgical instruments which are required for a specific surgical operation. Alternatively or in addition, a protocol may define the different instruments which are required at different stages during the surgical operation. Alternatively or in addition, the protocol may define an optimal location (or indeed, a range of locations) at which the instrument should be located on a surface (such as the surgical tray) in order that the instruments can be more easily accessed when required during the surgical procedure. For example, a first instrument may often be required to be used at a similar time as a second instrument. Accordingly, it may be that these two instruments should be located in close proximity to each other on the surface. Alternatively, for example, the protocol may define that the instruments should be located on the surface in a particular order or arrangement (such as in the order in which they are required to be used during the surgical procedure).
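By way of illustration only, a protocol of the kind described above might be represented as follows. The procedure name, instrument names and tray coordinates are hypothetical placeholders.

```python
# Minimal sketch of a stored protocol: required instruments, per-stage
# requirements, and optimal tray locations. All values are illustrative.
from dataclasses import dataclass, field

@dataclass
class Protocol:
    procedure: str
    required_instruments: set[str]
    # Instruments needed at each stage of the surgical operation.
    stage_instruments: dict[int, set[str]] = field(default_factory=dict)
    # Optimal tray position per instrument, e.g. (row, column).
    target_locations: dict[str, tuple[int, int]] = field(default_factory=dict)

example_protocol = Protocol(
    procedure="example procedure",
    required_instruments={"A", "B", "C", "D"},
    stage_instruments={1: {"A"}, 2: {"B", "C"}, 3: {"D"}},
    target_locations={"A": (0, 0), "B": (0, 1), "C": (0, 2), "D": (1, 0)},
)
```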
The protocols may be defined in advance based on standard medical practice. The protocols may be defined based upon the experience of medical professionals who have conducted previous surgical procedures. A protocol, once stored in the storage unit, may be updated on a regular basis in order that the protocol remains up to date with developments in standard medical practice and procedure. Accordingly, the protocol can be considered to provide an appropriate guide and/or checklist against which the instruments which have been detected can be compared.
In examples, the metadata received from the image capture device may indicate a particular surgical procedure or operation in addition to the information concerning the instruments which have been identified. The surgical procedure or operation which has been indicated in the metadata may then be used by the server in order to retrieve, from the storage unit, the relevant protocol for the current surgical procedure. In examples, the server may be configured to identify the surgical procedure in accordance with information which has been provided (such as a surgical timetable for the surgical environment or the like).
In examples, comparison between the standard protocol and the identified instruments (from the metadata information) can be performed by the processing unit 204 of the server 200 through a direct comparison. For example, if instruments A, B and C have been identified, yet the standard protocol indicates that the instruments A, B, C and D are required for the surgical procedure, the processing unit 204 of the server 200 may identify that the instrument D is currently missing from those instruments which have been identified. Accordingly, the server may generate guidance information indicating that the surgical instrument D is missing and should be included in order to provide a complete set of surgical instruments for the surgical procedure.
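This direct comparison amounts to a simple set difference, as the following sketch (using the same illustrative instruments A to D) shows:

```python
# Minimal sketch of the direct comparison between identified instruments
# and the standard protocol.
identified = {"A", "B", "C"}
required = {"A", "B", "C", "D"}     # from the standard protocol

missing = required - identified      # instruments still to be provided
surplus = identified - required      # instruments not in the protocol

if missing:
    print(f"Guidance: add missing instrument(s): {sorted(missing)}")
if surplus:
    print(f"Guidance: remove instrument(s) not required: {sorted(surplus)}")
```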
In examples, the processing unit 204 may be configured in order to use a trained model (such as that described with reference to the image capture device 100) in order to generate the guidance information. The trained model may be a machine learning model. The trained model may be trained using supervised learning, unsupervised learning and/or reinforcement learning, for example.
The trained model may be trained on training data which includes a list of instruments used for certain surgical procedures (or certain stages of the surgical procedures). The training data may also include the placement of those instruments on the surface. The training data may include data from past surgical events, for example. The training data may also include data such as standard medical protocol information, indicating the types of instruments which should be provided and the locations of those instruments on the surface. The model may be trained on this training data such that for any input surgical procedure or process, it can generate a list (as output) of the instruments which are required for that surgical procedure or process and, furthermore, the optimal location of the surgical instruments on the surface for that surgical procedure or process.
In examples, the training data used to train the trained model may also include additional information, such as the type of operator performing the surgical procedure and/or any linked or related medical conditions (such as pre-existing conditions of patients). This information can be used to train the trained model in order that it can identify differences between the instruments which should be used for a same surgical procedure for different operators and/or for different pre-existing medical conditions of the patient.
The processing unit 204 of the server 200 may also be configured in order to generate a different type of guidance information depending on a type of display unit which is available for display of the guidance information to the assistant (or other person) within the surgical environment. Different types of guidance information and the manner of provision of this guidance information to the assistant (or other person) within the surgical environment are described in more detail with reference to Figure 5 of the present disclosure.
Once the guidance information has been generated by the processing unit 204 (through direct comparison with the standard protocol and/or through use of a trained model or the like), the processing unit may perform control of a display device to display the guidance to a user, the guidance comprising an indication of a surgical instrument. The server may use the communication unit 206 in order to communicate with and perform control of the display device.
<Configuration of Image Capture Devices>
Consider, now, Figure 4 of the present disclosure, which illustrates an example configuration of imaging devices in accordance with embodiments of the disclosure.
Figure 4 provides an example of how different image capture devices can be arranged in a surgical environment when a plurality of image capture devices are provided as part of the system. That is, it will be appreciated that in examples a single image capture device 100A may be provided as part of the system. However, in other examples, a plurality of image capture devices may be provided. Use of a plurality of image capture devices can further improve the accuracy of instrument identification, since each of the plurality of image capture devices can be arranged in order to provide a unique viewpoint of the instrument surface. This makes it less likely that the view of the instrument tray will become obstructed.
In the example of Figure 4, four additional image capture devices 100B are provided as part of the system. Each of the four additional image capture devices provides a unique view of the surface 300 on which the assistant 302 arranges the instruments ahead of the surgical procedure.
In this example, each of the four additional image capture devices 100B is a secondary image capture device, while the image capture device 100A is the primary (or master) image capture device. The image capture devices (including the primary image capture device and any secondary image capture devices) can communicate using any suitable wired or wireless communication. In examples, the image capture devices may communicate locally with each other using wireless communication. In examples, the image capture devices may communicate locally with each other using a wireless communication standard such as 5G wireless communication. In examples, a wireless communication technology such as Bluetooth can be used for close-range data transmission between the respective image capture devices.
In examples, each of the secondary image capture devices is configured to acquire image data of the scene (from its unique viewpoint). Then, instead of processing the image data to generate the metadata, the secondary image capture device transmits the image data via a local communication mechanism to the master image capture device. The master image capture device is then responsible for processing the image data that it has acquired and the image data received from the other image capture devices (the secondary image capture devices) in order to identify the instruments located on the surface 300.
For example, the master image capture device may collate the image data which has been acquired and provide this as input to the trained model for identification of the instruments located on the surface. In this way, the trained model (or other means of instrument identification) has access to the image data from across all the different image capture devices at the time at which the instrument identification is performed. This is advantageous, as increased input image data to the model will result in improved reliability and accuracy of the instrument identification (and reduces the likelihood of conflicts which may arise if each image capture device individually and independently applied a trained model to its own image data).
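A minimal sketch of this collation step is shown below; the model interface is an illustrative assumption, with the detection logic itself omitted.

```python
# Minimal sketch of the master device pooling its own frame with frames
# received from the secondary devices and running one joint pass.
def identify_instruments(frames):
    # Placeholder for the trained model operating on all viewpoints at
    # once; multiple views reduce occlusion and detection ambiguity.
    return [{"type": "scalpel", "x": 80, "y": 210}]

def collate_and_identify(own_frame, secondary_frames):
    frames = [own_frame, *secondary_frames]   # all viewpoints together
    return identify_instruments(frames)
```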
Once the master image capture device has performed the identification processing and generated the metadata information, the image data can be purged and only the metadata information can be transmitted to the server. Accordingly, the system still achieves compliance with data restrictions such as GDPR, since the image data is only transmitted locally within the surgical environment (between image capture devices) and is not transmitted externally to the server.
While the example of Figure 4 has been described with reference to an example with one primary image capture device and four secondary image capture devices, it will be appreciated that the present disclosure is not particularly limited in this regard. In examples, there may only be a single image capture device. In examples, there may be only a single secondary image capture device in addition to the primary image capture device. In examples, there may be a plurality of secondary image capture devices in addition to the primary image capture device. In examples, the number of secondary image capture devices may be much higher than the number of secondary image capture devices described with reference to Figure 4 of the present disclosure. Alternatively, the number of secondary image capture devices may be less than the number shown in Figure 4 of the present disclosure. Moreover, the location of the image capture devices is not limited to that shown in the example of Figure 4 of the present disclosure. More generally, the image capture devices may be arranged at any given location within the surgical environment, provided that the image capture device is able to capture an image of the surface 300 upon which the instruments are arranged.
<Example Implementation>
Turning, now, to Figure 5 of the present disclosure, a specific example application of a system of the present disclosure to a surgical procedure is illustrated.
In this example, an assistant 5000 is arranging surgical instruments for an upcoming surgical procedure on a surface. In this example, the surface is a surgical tray 5002 on a table in the surgical environment (the surgical theatre).
Within the surgical environment, a plurality of image capture devices 100 are provided. This includes a main image capture device (the main edge AI camera in this example) and at least one secondary image capture device (the sub edge AI camera in this example). Each of the main and secondary image capture devices acquires image data of the instruments on the surgical tray.
In examples, the image capture devices may be mounted to the surgical tray (or the table on which the surgical tray is placed) in order that an improved view of the surgical tray can be provided. However, in other examples, the image capture devices may be mounted at any location within the surgical environment, provided that the image capture devices can acquire an image of the surgical tray. This may include mounting the image capture devices on the floor, wall or ceiling within the surgical environment.
The detection accuracy of the system can be further enhanced by using additional image capture devices with additional viewpoints on the surgical tray, enabling a better differentiation of surgical instruments from different viewpoints (e.g. preventing occlusion and reducing detection ambiguity). In examples, the image capture devices may be arranged at certain locations within the surgical environment during an initial calibration phase of the system.
Once a secondary image capture device acquires image data, it provides that image data over a local network to the main image capture device. Since transfer of the image data is made only over the local network (such as an intranet), full compliance with data restrictions (such as GDPR) governing the use of sensitive data can be observed.
Then, once the main image capture device has acquired its own image data and received the image data from the secondary image capture devices, the main image capture device can analyze the image data (using a vision-based edge AI trained model, for example) to generate metadata indicating at least one of the type and position (including location and/or orientation) of the instruments which have been identified. That is, the main image capture device can use object detection and recognition to infer the current set of instruments on the table as well as the current instrument held by the human preparation staff standing at the instrument table.
This information can then be transmitted to a server for additional processing and analysis. In particular, the server can use the metadata to compare the instruments which have been identified with a target instrument set (based, for example, on a standard protocol). Guidance information can then be generated on the basis of this comparison for provision to the assistant.
In the example of Figure 5, three different types of display devices for providing the guidance information to the assistant are shown. The display devices are devices capable of providing audio and/or visual information to the assistant.
A first display device 5008A is an image projector. The image projector 5008A is located within the surgical environment (the surgical theatre) and configured such that it can project light (such as an image) onto the surgical tray 5002. In examples, the image projector 5008A may be mounted on a ceiling above the surgical tray 5002. Accordingly, the image projector 5008A can be used to project information on the surgical tray (or other surface, such as an instrument table) to provide guidance to the assistant.
The nature of the information provided to the assistant (using the image projector 5008A) may vary depending on the situation to which the embodiments of the disclosure are applied.
However, in examples, the projector 5008A may be controlled in order to highlight one or more surgical instruments in the scene. The one or more surgical instruments may include an instrument which should be used next during the surgical procedure (and will need to be passed to a surgeon). The one or more surgical instruments may include an instrument which has been incorrectly placed on the surgical tray and should therefore be removed (such as an instrument which is not required during the surgical procedure). The one or more surgical instruments may include a specific instrument from amongst the instruments on the surgical tray which has been requested by the surgeon (with the request of the surgeon being identified in accordance with voice recognition processing, for example).
Alternatively or in addition, the projector 5008A may be controlled to project images of surgical instruments in the scene. The images of the surgical instruments in the scene may include images of a set of surgical instruments which are required for the surgical procedure (based on a standard protocol). The images of the surgical instruments may also include images of the surgical instruments already present on the surgical tray, but at a different location to the actual location of those instruments. In this way, the image projector 5008A may be used in order to guide the assistant to position the instruments at an optimal location on the surgical tray; that is, a visual support of what instrument should be placed, and where, can be provided to the assistant. The images of the surgical instruments may also be used in order to provide a visual representation to the assistant of any missing instruments.
In this way, the system can be used in order to validate the surgical instruments which have been provided in the scene or selected by the assistant. In addition, the system can be used to provide a recommendation concerning a set of instruments which should be used and/or suggest any extension to the set of surgical instruments which have been provided. Moreover, during the surgical procedure, the system can be used in order to highlight a required surgical instrument and/or indicate a surgical instrument which will be required in the next stage of the surgical procedure.
A second display device which may also be provided, either alternatively or in addition, is an audio generation device 5008B. In examples, the audio generation device 5008B may be a device such as a speaker or the like. The audio generation device 5008B may be used in order to provide audio instructions to the assistant as guidance information.
The audio instructions provided to the assistant may vary depending on the situation to which the embodiments of the disclosure are applied. For example, the audio instructions may vary depending on whether the audio generation device is used in combination with a visual display device (such as the image projector 5008A) or whether the audio generation device is used as an independent display device. Furthermore, the audio instructions may vary depending on whether the system is used for guidance during preparation of the surgical instruments (ahead of a surgical procedure) or whether the system is used for guidance in provision of the surgical instruments (during the surgical procedure).
In the example of Figure 5 of the present disclosure, the system is used during preparation of the surgical instruments (ahead of a surgical procedure). Therefore, the guidance provided by the audio generation device 5008B includes guidance information for the assistant when preparing the surgical instruments on the surface (here, the surgical tray on the table). Moreover, in this specific example, the audio generation device 5008B is used in combination with a visual display device (such as the projector 5008A or a display screen 5008C). Accordingly, the audio generation device provides the instruction, "Please place the selected instrument onto the marked area in tray". The assistant is therefore guided to place the instrument they are currently holding at a correct location on the surgical tray (with the marked area in the tray being indicated by a display device such as the projector 5008A). In this way, the system can be used in order to provide guidance to the assistant such that a more reliable preparation and selection of surgical instruments can be achieved. Thus, the safety and efficiency of surgical procedures can be improved, thereby further improving patient outcomes from surgical procedures.
A third display device which may also be provided, either alternatively or in addition, is the display screen 5008C. The display screen is a type of display device (such as a computer monitor, television screen or the like) which is able to display visual information that can be seen by a person viewing the screen. The display screen may also be able to provide audio information; that is, the display screen may contain an audio generation device such as the audio generation device 5008B.
While the projector displays visual information to the assistant by projecting light or images onto a surface (such as the tray), the display screen displays visual information on the screen such that a person looking at the screen will be presented with the desired visual information.
The nature of the information which is displayed to the person (such as the assistant) on the display screen 5008C may vary depending upon the situation to which the embodiments of the disclosure are applied. For example, the information may vary depending upon whether or not the display screen 5008C is the only display device which is present. Furthermore, a variation in the information which may be provided may occur depending on whether the system is being used in order to provide guidance during the preparation of instruments (ahead of the surgical procedure) or whether the system is being used in order to guide the assistant in providing the surgeon with a desired instrument (during the surgical procedure).
Figure 6 illustrates an example provision of guidance in accordance with embodiments of the disclosure. In the example of Figure 6, a detailed illustration of an image as may be provided on the display screen 5008C is shown.
In this specific example, the system is being used in order to provide guidance during the surgical procedure. Therefore, the guidance information is information which will assist in the accurate and reliable selection of the instrument which should be provided to the surgeon at a given stage during the surgical procedure.
Accordingly, in this example, the display screen shows an image of the instruments on the surgical tray. Furthermore, the instrument which should be passed to the surgeon is highlighted on the display screen. In this example, instrument A is highlighted as the instrument which should be selected. In this example, the instrument is highlighted by a box around the instrument on the display screen and accompanying text next to the image on the display screen ("Select Instrument: A"). However, the present disclosure is not particularly limited to this example. That is, any suitable method of highlighting the instrument, using the display screen, may be provided in accordance with embodiments of the disclosure. For example, the brightness, contrast or colour of the instrument to be selected may be changed. Alternatively, an element (such as an arrow or the like) may be provided to highlight the item for selection. Alternatively, a size of the instrument in the image on the display screen may be changed to highlight the item for selection.
In addition, in this example, an instrument which will be required by the surgeon in a next stage of the surgical procedure is also highlighted. In this example, this is shown by the text "Upcoming Instrument: 5" on the display screen. However, similar to the instrument to be selected, the manner by which the upcoming instrument is highlighted is not particularly limited in accordance with embodiments of the disclosure.
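As an illustration of how such highlights might be rendered, the following sketch draws a box and the accompanying text onto an image of the tray, assuming the OpenCV package; the image path, box coordinates and labels are hypothetical placeholders.

```python
# Minimal sketch of highlighting instruments on a display screen,
# assuming OpenCV is installed and "tray_view.jpg" exists.
import cv2

image = cv2.imread("tray_view.jpg")

# Box around instrument A, plus accompanying text, as in Figure 6.
cv2.rectangle(image, (100, 150), (260, 320), (0, 255, 0), thickness=3)
cv2.putText(image, "Select Instrument: A", (20, 40),
            cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2)
cv2.putText(image, "Upcoming Instrument: 5", (20, 80),
            cv2.FONT_HERSHEY_SIMPLEX, 1.0, (255, 255, 0), 2)

cv2.imshow("Guidance", image)
cv2.waitKey(0)
```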
Thus, the display screen can be used to present information to the human staff. The information presented will vary depending on the situation to which the embodiments of the disclosure are applied. In examples, a target instrument set can be presented on the display screen (e.g. a set determined in accordance with the standard protocol). Furthermore, in examples, a current instrument which is required can be highlighted (as described above). In examples, such as when providing guidance during instrument preparation, a checklist may be presented to the operation technical assistant, showing e.g. the standard protocol against the actual detection results, in order to quickly identify missing instruments in the prepared set. In examples, a set of instruments that differs from the standard protocol, depending on the surgeon’s preferences (e.g. derived from previous similar operations) can be presented.
In examples, an extended set of instruments (beyond those which have already been selected, e.g. based on the standard protocol) may be presented, based on the patient's pre-existing conditions. This enables the assistant to immediately identify which instruments may be required for the specific patient beyond the instruments which should generally be provided (e.g. based on the standard protocol).
In examples, the current set of instruments (as on the instrument table) may be displayed. Then, a requested instrument (based on an instruction from a surgeon) and a next instrument according to standard protocol may be displayed.
In this way, the system can be used in order to provide guidance to the assistant such that a more reliable preparation and selection of surgical instruments can be achieved. Thus, the safety and efficiency of surgical procedures can be improved, thereby further improving patient outcomes from surgical procedures.
<Method>
Hence, more generally, a method of providing guidance for selection of a surgical instrument is provided in accordance with embodiments of the present disclosure.
Figure 7 illustrates an example method in accordance with embodiments of the present disclosure. The example method described with reference to Figure 7 of the present disclosure may be performed by a system such as that described with reference to Figure 2 of the present disclosure (i.e. a system including an image capture device 100 and a server 200). In addition, in embodiments of the disclosure, a computer program product may be provided which, when executed by a computer (or by the components of the system described with reference to Figure 2 of the present disclosure), causes the computer (or the components of the system) to perform the method of embodiments of the present disclosure. In examples, the computer program may be stored on a non-transitory computer readable storage medium.
The example method of Figure 7 of the present disclosure starts at step S700 and proceeds to step S702.
In step S702, the method comprises acquiring image data of a scene using an image capture device.
In step S704, the method comprises analysing, at the image capture device, the image data to identify one or more instruments located in the scene.
In step S706, the method comprises transmitting the metadata, generated in accordance with the identified instruments, to a server.
In step S708, the method comprises processing the metadata, at the server, to generate guidance in accordance with the surgical instruments which have been identified and a standard protocol.
In step S710, the method comprises controlling a display device to display the guidance to a user, the guidance comprising an indication of a surgical instrument.
The method then proceeds to and ends with step S712.
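For illustration, the following sketch ties steps S702 to S710 together in a single pipeline; each function body is a hypothetical placeholder for the processing described above.

```python
# Minimal sketch of the method of Figure 7 as a single pipeline.
def acquire_image():                        # S702
    return b"raw-image-bytes"

def analyse(image):                         # S704
    return {"instruments": [{"type": "A"}]}

def transmit_to_server(metadata):           # S706
    return metadata                         # stands in for a network call

def generate_guidance(metadata, protocol):  # S708
    identified = {i["type"] for i in metadata["instruments"]}
    return {"missing": sorted(protocol - identified)}

def display_guidance(guidance):             # S710
    print("Guidance:", guidance)

image = acquire_image()
metadata = analyse(image)
received = transmit_to_server(metadata)
display_guidance(generate_guidance(received, protocol={"A", "B"}))
```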
Embodiments of the present disclosure are not particularly limited to the number and arrangement of method steps which have been illustrated with reference to the specific example of Figure 7 of the present disclosure. Any number of additional steps may be provided. Alternatively, or in addition, at least one or more of the steps of the method of Figure 7 may be performed in sequence and/or in parallel with the other steps of the method. Hence, the present disclosure is not particularly limited in this regard.
Further embodiments of the present disclosure are provided in accordance with the following numbered clauses:
1. A system for providing guidance for selection of a surgical instrument, the system comprising: an image capture device (100) configured to: acquire (S702) image data of a scene; analyse (S704) the image data to identify one or more instruments located in the scene (300); and transmit (S706) metadata, generated in accordance with the identified instruments, to a server; and the server (200), wherein the server is configured to: process (S708) the metadata to generate guidance in accordance with the surgical instruments which have been identified and a standard protocol; and control (S710) a display device (5008A, 5008B, 5008C) to display the guidance to a user, the guidance comprising an indication of a surgical instrument.
2. The system according to clause 1, wherein the image capture device is configured to perform object recognition to identify one or more instruments located in the scene.
3. The system according to clause 1 or 2, wherein the image capture device is configured to use a trained model to identify one or more instruments located in the scene.
4. The system according to any preceding clause, wherein the standard protocol is related to a surgical procedure to be performed and the server is configured to control a display device to display the guidance to the user, the guidance comprising an indication of the surgical instrument for the surgical procedure.
5. The system according to clause 4, wherein the indication of the surgical instrument for the surgical procedure comprises a highlight of an instrument for use in a current stage of the surgical procedure.
6. The system according to clause 4 or clause 5, wherein the indication of the surgical instrument for the surgical procedure comprises a set of instruments for use in the surgical procedure.
7. The system according to any of clauses 4 to 6, wherein the indication of the surgical instrument for the surgical procedure comprises a display of an extended set of surgical instruments for use in the surgical procedure in addition to the instruments which have been identified.
8. The system according to any of clauses 4 to 7, wherein the server is configured to determine the surgical procedure in accordance with at least one from a list comprising: image data, input from a user, and a surgical schedule.
9. The system according to clause 8, wherein the server is configured to receive audio data from the user as input to determine the surgical procedure.
10. The system according to any preceding clause, wherein the guidance comprises at least one of a list comprising: guidance information for validation of a surgical instrument for the surgical procedure, guidance information for extension of the surgical instruments for the surgical procedure, and guidance information for recommendation of a surgical instrument for the surgical procedure.
11. The system according to clause 10, wherein extension of the surgical instruments includes an indication of one or more additional surgical instruments which should be provided.
12. The system according to any preceding clause, wherein the image capture device is configured to purge the image data which has been acquired once the metadata has been transmitted to the server.
13. The system according to any preceding clause, wherein the trained model is trained on training data comprising images of surgical instruments.
14. The system according to any preceding clause, wherein the system comprises a plurality of image capture devices, each image capture device configured to acquire image data from a particular viewpoint.
15. The system according to clause 14, wherein each of the plurality of image capture devices is configured to acquire image data of the scene; and transmit the image data to a first image capture device; wherein the first image capture device is configured to analyse the image data which has been acquired by the plurality of image capture devices to identify one or more instruments located in the scene and transmit metadata, generated in accordance with the identified instruments, to a server.
16. The system according to any preceding clause, wherein the server is further configured to control an audio device to provide the guidance to the user.
17. The system according to any preceding clause, wherein the guidance further comprises an indication of a target location of a surgical instrument in the surgical scene for the surgical procedure.
18. A method of providing guidance for selection of a surgical instrument, the method comprising: acquiring (S702) image data of a scene using an image capture device (100); analysing, at the image capture device, the image data to identify one or more instruments located in the scene; transmitting the metadata, generated in accordance with the identified instruments, to a server; processing the metadata, at the server, to generate guidance in accordance with the surgical instruments which have been identified and a standard protocol; and controlling a display device to display the guidance to a user, the guidance comprising an indication of a surgical instrument.
19. A computer program product comprising instructions which, when executed by a computer, cause the computer to perform a method according to clause 18.
20. A non-transitory computer readable storage medium comprising the computer program product according to clause 19.
It will be appreciated that numerous modifications and variations of the present disclosure are possible in light of the above teachings. It is therefore to be understood that within the scope of the appended claims, the disclosure may be practiced otherwise than as specifically described herein.
In so far as embodiments of the disclosure have been described as being implemented, at least in part, by software-controlled data processing apparatus, it will be appreciated that a non- transitory machine-readable medium carrying such software, such as an optical disk, a magnetic disk, semiconductor memory or the like, is also considered to represent an embodiment of the present disclosure.
It will be appreciated that the above description for clarity has described embodiments with reference to different functional units, circuitry and/or processors. However, it will be apparent that any suitable distribution of functionality between different functional units, circuitry and/or processors may be used without detracting from the embodiments.
Described embodiments may be implemented in any suitable form including hardware, software, firmware or any combination of these. Described embodiments may optionally be implemented at least partly as computer software running on one or more data processors and/or digital signal processors. The elements and components of any embodiment may be physically, functionally and logically implemented in any suitable way. Indeed the functionality may be implemented in a single unit, in a plurality of units or as part of other functional units. As such, the disclosed embodiments may be implemented in a single unit or may be physically and functionally distributed between different units, circuitry and/or processors.
Although the present disclosure has been described in connection with some embodiments, it is not intended to be limited to the specific form set forth herein. Additionally, although a feature may appear to be described in connection with particular embodiments, one skilled in the art would recognize that various features of the described embodiments may be combined in any manner suitable to implement the technique.


CLAIMS:
1. A system for providing guidance for selection of a surgical instrument, the system comprising: an image capture device configured to: acquire image data of a scene; analyse the image data to identify one or more instruments located in the scene; and transmit metadata, generated in accordance with the identified instruments, to a server; and the server, wherein the server is configured to: process the metadata to generate guidance in accordance with the surgical instruments which have been identified and a standard protocol; and control a display device to display the guidance to a user, the guidance comprising an indication of a surgical instrument.
2. The system according to claim 1, wherein the image capture device is configured to perform object recognition to identify one or more instruments located in the scene.
3. The system according to claim 1 or 2, wherein the image capture device is configured to use a trained model to identify one or more instruments located in the scene.
4. The system according to claim 1, wherein the standard protocol is related to a surgical procedure to be performed and the server is configured to control a display device to display the guidance to the user, the guidance comprising an indication of the surgical instrument for the surgical procedure.
5. The system according to claim 4, wherein the indication of the surgical instrument for the surgical procedure comprises a highlight of an instrument for use in a current stage of the surgical procedure.
6. The system according to claim 4, wherein the indication of the surgical instrument for the surgical procedure comprises a set of instruments for use in the surgical procedure.
7. The system according to claim 4, wherein the indication of the surgical instrument for the surgical procedure comprises a display of an extended set of surgical instruments for use in the surgical procedure in addition to the instruments which have been identified.
8. The system according to claim 4, wherein the server is configured to determine the surgical procedure in accordance with at least one from a list comprising: image data, input from a user, and a surgical schedule.
9. The system according to claim 8, wherein the server is configured to receive audio data from the user as input to determine the surgical procedure.
10. The system according to claim 1, wherein the guidance comprises at least one of a list comprising: guidance information for validation of a surgical instrument for the surgical procedure, guidance information for extension of the surgical instruments for the surgical procedure, and guidance information for recommendation of a surgical instrument for the surgical procedure.
11. The system according to claim 10, wherein extension of the surgical instruments includes an indication of one or more additional surgical instruments which should be provided.
12. The system according to claim 1, wherein the image capture device is configured to purge the image data which has been acquired once the metadata has been transmitted to the server.
13. The system according to claim 1, wherein the trained model is trained on training data comprising images of surgical instruments.
14. The system according to claim 1, wherein the system comprises a plurality of image capture devices, each image capture device configured to acquire image data from a particular viewpoint.
15. The system according to claim 14, wherein each of the plurality of image capture devices is configured to acquire image data of the scene; and transmit the image data to a first image capture device; wherein the first image capture device is configured to analyse the image data which has been acquired by the plurality of image capture devices to identify one or more instruments located in the scene and transmit metadata, generated in accordance with the identified instruments, to a server.
16. The system according to claim 1, wherein the server is further configured to control an audio device to provide the guidance to the user.
17. The system according to claim 1, wherein the guidance further comprises an indication of a target location of a surgical instrument in the surgical scene for the surgical procedure.
18. A method of providing guidance for selection of a surgical instrument, the method comprising: acquiring image data of a scene using an image capture device; analysing, at the image capture device, the image data to identify one or more instruments located in the scene; transmitting the metadata, generated in accordance with the identified instruments, to a server; processing the metadata, at the server, to generate guidance in accordance with the surgical instruments which have been identified and a standard protocol; and controlling a display device to display the guidance to a user, the guidance comprising an indication of a surgical instrument.
19. A computer program product comprising instructions which, when executed by a computer, cause the computer to perform a method according to claim 18.
20. A non-transitory computer readable storage medium comprising the computer program product according to claim 19.