WO2025067654A1 - Monoscopic marker tracking
- Publication number
- WO2025067654A1 (PCT/EP2023/076869)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- data
- camera
- subsection
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
- G06T7/75—Determining position or orientation of objects or cameras using feature-based methods involving models
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T19/00—Manipulating 3D models or images for computer graphics
- G06T19/006—Mixed reality
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/20—Scenes; Scene-specific elements in augmented reality scenes
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20048—Transform domain processing
- G06T2207/20061—Hough transform
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30204—Marker
- G06T2207/30208—Marker matrix
Definitions
- the present invention relates to a tracking system and computer-implemented method for determining the spatial position of a tracking marker, a corresponding computer program, a computer-readable storage medium storing such a program and a computer executing the program.
- augmented reality devices, particularly head-mounted augmented reality devices
- Typical augmented reality devices include a monoscopic camera providing a video stream of the practitioner's field of view, on which basis the supplemented information can be positionally aligned with the practitioner's optical perception of real-world objects.
- Embedding these augmented reality devices into the infrastructure of navigation systems requires the coordinate system of the navigation system and the coordinate system of the augmented reality device to be spatially aligned, which is regularly done with the help of specific tracking references tracked over time by both the augmented reality device and a tracking system assigned to the navigation system.
- the present invention has the object of providing a straightforward approach to embedding an augmented reality device into the infrastructure of a navigation system, particularly a medical navigation system, which may even allow for utilizing the augmented reality device's monoscopic camera for tracking purposes so as to supplement or even replace conventional tracking systems such as stereoscopic optical tracking systems.
- the present invention can be used for any tracking procedures, particularly in the medical field, which may involve the use of monoscopic tracking cameras for tracking purposes, thereby supplementing or even replacing conventional optical tracking systems such as Curve® or Kick®, both products of Brainlab AG.
- the invention encompasses acquiring a two-dimensional image including a depiction of an object, particularly of a tracking marker, and employing artificial intelligence, particularly a machine-learning-model, to extract information from the two-dimensional image so as to eventually determine the spatial position of the object with respect to the camera.
- the invention reaches the aforementioned object by providing, in a first aspect, a tracking system for determining the spatial position of a tracking marker, including a camera and a computer which is operatively coupled to the camera and has at least one processor adapted to carry out a method comprising the following steps:
- image section data describing an image section of the optical image, having a predefined pixel count in a horizontal and/or vertical direction of the image
- image subsection position data describing the extent and position of the plurality of image subsections within the image plane of the image received via the camera
- image feature position data describing a spatial position of the depictions of the features within the image plane of the image received via the camera
- marker position data describing the spatial position of the real-world tracking marker with respect to the camera.
- tracking marker should be understood to encompass any conceivable type of tracking marker, for example a marker including a plurality of retro-reflective marker spheres in a predefined spatial arrangement, a marker being defined by a single, particularly integrally formed structure which includes a plurality of distinguishable features, or a flat (2D-)marker having its distinguishable features basically disposed in a two-dimensional plane.
- spatial position may include the spatial location and/or the spatial orientation of an object.
- the spatial position may include three translational degrees of freedom and three rotational degrees of freedom in three-dimensional space.
- spatial position may also be referred to as "6D-pose".
- differentiate electromagnetic signature as used herein may be understood as the specific optical appearance or the specific optical distinctiveness of a feature of the tracking marker in which the feature differs from other parts, particularly the remaining parts of the tracking marker.
- the image received via the camera may have already undergone any kind of pre-processing before being fed to the method described herein.
- the at least one image or video frame of a video stream may, after being received by the camera's CCD- or CMOS-sensor and before the inventive method sets in, be processed in any conceivable manner, for example to optimize the image data for further processing.
- the image received via the camera is transferred to a predefined format having a predefined number of pixels in a vertical direction and/or a horizontal direction of the image so as to facilitate further processing of image information. If the image received by the camera already features the predefined pixel count in a horizontal and/or a vertical direction of the image, this step can be omitted.
- the image section can represent the entire image received via the camera, or a subset of the image received via the camera.
- the image section having a predefined pixel count is processed, thereby utilizing a machine-learning-algorithm that has been trained to detect the presence of a plurality of optically distinguishable features of the tracking marker within the image section referred to above.
- a machine-learning-algorithm can be trained to detect, i.e. to label a depiction of one or more features in the image section in association with one or more training images showing a depiction of this feature.
- This may involve the use of a convolutional network.
- Convolutional networks, also known as convolutional neural networks or CNNs, are an example of neural networks for processing data that has a known grid-like topology. Examples include time-series data, which can be thought of as a 1-D grid taking samples at regular time intervals, and image data, which can be thought of as a 2-D grid of pixels.
- the name “convolutional neural network” indicates that the network employs the mathematical operation of convolution. Convolution is a linear operation.
- Convolutional networks are simply neural networks that use convolution in place of general matrix multiplication in at least one of their layers. There are several variants on the convolution function that are widely used in practice for neural networks. In general, the operation used in a convolutional neural network does not correspond precisely to the definition of convolution as used in other fields, such as engineering or pure mathematics.
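To make the remark on convolution variants concrete, the following minimal sketch contrasts a true 2-D convolution, which flips the kernel, with the cross-correlation that neural-network libraries commonly compute under the name "convolution". The function names and the tiny example image are illustrative assumptions and are not taken from the application.

```python
def cross_correlate2d(image, kernel):
    """Valid-mode 2-D cross-correlation: slide the kernel without flipping it."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def convolve2d(image, kernel):
    """True 2-D convolution: flip the kernel in both axes, then correlate."""
    flipped = [row[::-1] for row in kernel[::-1]]
    return cross_correlate2d(image, flipped)


# Hypothetical 3 x 3 image and 2 x 2 kernel.
image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
kernel = [
    [1, 0],
    [0, -1],
]

correlated = cross_correlate2d(image, kernel)  # [[-4, -4], [-4, -4]]
convolved = convolve2d(image, kernel)          # [[4, 4], [4, 4]]
```

For this asymmetric kernel the two operations differ in sign, which is why the distinction matters when comparing a network layer with the textbook definition.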
- an image subsection, which may also be referred to as a "bounding box"
- bounding boxes are defined which represent those parts of the image section which show a depiction of a marker feature. These bounding boxes may have any conceivable shape, for example a rectangular shape or square shape.
- the bounding boxes defined in the image section having a predefined pixel count are transferred back to the image received via the camera and having the original pixel count or resolution.
- the position and extent of the bounding boxes that have initially been defined within the image section is determined within the image having the original resolution.
- the bounding boxes calculated for the image having the original resolution cover the same image content as their respective counterparts in the image section having the predefined resolution.
- the bounding boxes determined in the preceding step each encompass a depiction of a marker feature the position of which is determined within the respective bounding box defined for the image having the original resolution.
- this may involve applying one of a plurality of available approaches to calculate the position of the depiction of the feature within a respective bounding box.
- the information about the position of the bounding boxes within the image having the original resolution, and the information about the position of the feature depictions within the respective bounding boxes allows for calculating the position of the feature depictions within the image having the original resolution.
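The back-transfer of bounding boxes and the composition of feature positions can be sketched as follows. This is a minimal illustration that assumes the image section was obtained by an optional crop followed by a plain resize; the `Box` type, the function names and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box: top-left corner plus width and height."""
    x: float
    y: float
    w: float
    h: float


def section_box_to_image(box, scale_x, scale_y, crop_x=0.0, crop_y=0.0):
    """Map a box from image-section pixels to original-image pixels.

    scale_x / scale_y: original pixels per section pixel (assumed known
    from the resize step); crop_x / crop_y: top-left corner of the crop
    in the original image (0 if the section was not cropped).
    """
    return Box(
        x=box.x * scale_x + crop_x,
        y=box.y * scale_y + crop_y,
        w=box.w * scale_x,
        h=box.h * scale_y,
    )


def feature_in_image(box_in_image, feature_in_box):
    """Add the box origin to a feature position measured inside the box."""
    fx, fy = feature_in_box
    return (box_in_image.x + fx, box_in_image.y + fy)


# Hypothetical numbers: a 3840 x 2160 frame resized to 640 x 640.
scale_x = 3840 / 640   # 6.0
scale_y = 2160 / 640   # 3.375
detected = Box(x=100, y=200, w=20, h=32)

box_full = section_box_to_image(detected, scale_x, scale_y)
center = feature_in_image(box_full, (61.5, 52.25))
```

Because the box in the original image covers the same content as its counterpart in the section, the feature position can be refined at full resolution and then expressed in full-image coordinates by a simple offset.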
- the position of the plurality of features within the camera's field of view and projected onto the 2D-image-plane is accurately known.
- the expected relative position of the plurality of marker features may be taken or acquired from a three-dimensional model of a real-world tracking marker which is seen on the image received via the camera, i.e. a depiction of which is shown on the image.
- the information about the 2D-position of the features within the plane of the image received via the camera on the one hand, and the 3D-relative position of the features acquired from the model on the other hand eventually allows for determining the spatial position of the actual tracking marker with respect to the camera. This may involve the use of one of the applicable approaches described below.
- In an example of the tracking system according to the first aspect, the camera
- - is adapted to receive electromagnetic radiation in the visible spectrum of light
- - is integrated in a wearable device, particularly in a head-mounted augmented reality device, specifically in AR-(augmented reality)glasses.
- the camera may be an RGB-camera. It may also be a camera which is capable of generating an RGB-image, particularly via a first CCD- or CMOS-sensor, as well as of generating an IR-(infrared)-image, particularly via a second CCD- or CMOS-sensor.
- the camera may also be a high-resolution camera, particularly a 4K-camera or higher.
- the camera may form part of a head-mounted AR-device, particularly AR-glasses, which has a semi-transparent screen for having graphical information projected into the field of view of a person wearing the head-mounted AR-device.
- the camera may also be configured as a "static" camera, i.e. a camera which is embedded in a casing or housing meant to maintain its spatial position.
- the computer processor of the above-described tracking system according to the first aspect may be adapted to carry out any method described in the following.
- the tracking system according to the first aspect may include a camera and a computer which is operatively coupled to the camera and has at least one processor adapted to carry out the method according to any one of the claims 3 to 12.
- a method comprises executing, on at least one processor of at least one computer (for example at least one computer being part of a navigation system), the following exemplary steps which are executed by the at least one processor:
- image subsection position data describing the extent and position of the plurality of image subsections within the image plane of the image received via the camera
- image feature position data describing a spatial position of the depictions of the features within the image plane of the image received via the camera
- determining image section data involves: - a modification, particularly a reduction of the pixel count in a horizontal and/or in a vertical direction of the image received via the camera; and/or
- the image section may be obtained by "resizing" the image received via the camera.
- the image content of the image received via the camera remains the same in the image section, though with a reduced pixel count or resolution in a horizontal and/or vertical direction of the image.
- the image section may also be obtained by "cropping" the image received via the camera.
- the image section represents a cut-out of the image received via the camera.
- the image section is obtained by both "resizing" and "cropping" the image received via the camera.
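As a minimal sketch of "cropping" and "resizing", the following example cuts a window out of a small image and then reduces its pixel count by nearest-neighbour sampling. The image is a plain list of rows, and the function names, the interpolation method and the tiny dimensions are assumptions made for illustration; a real pipeline would typically rely on an image-processing library.

```python
def crop(image, left, top, width, height):
    """Cut a width x height window out of a row-major image (list of rows)."""
    return [row[left:left + width] for row in image[top:top + height]]


def resize_nearest(image, new_width, new_height):
    """Nearest-neighbour resize; keeps the content, changes the pixel count."""
    old_height, old_width = len(image), len(image[0])
    return [
        [
            image[r * old_height // new_height][c * old_width // new_width]
            for c in range(new_width)
        ]
        for r in range(new_height)
    ]


# Hypothetical 8 x 4 "camera image" whose pixel values encode their position.
camera_image = [[10 * r + c for c in range(8)] for r in range(4)]

cut_out = crop(camera_image, left=2, top=0, width=4, height=4)   # 4 x 4
section = resize_nearest(cut_out, new_width=2, new_height=2)     # 2 x 2
```

The crop offset and the resize factors used here are the quantities needed later to transfer bounding boxes back to the original image.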
- determining feature detection data involves applying a convolutional-neural-network for an object-detection-model, particularly YOLOv5 or higher, specifically YOLOv8, to detect depictions of a plurality of features of the tracking marker within the image section.
- a convolutional-neural-network may be utilized for detecting depictions of the features of a tracking marker within the image section.
- With YOLOv5, it becomes apparent that an image section having a predefined resolution needs to be obtained from a high-resolution image as received via the camera.
- YOLOv5 is adapted to process images having a pixel count of 640 x 640 pixels so as to reduce computational effort to a reasonable level.
- the machine-learning-model, particularly the object-detection-model, specifically YOLOv5, has been trained to detect depictions of the entire tracking marker and/or of the features of the tracking marker within the image section, particularly wherein the features have a predefined
- this may involve rejecting detected depictions of features lying remote from the detected depiction of the tracking marker, particularly lying outside a predefined boundary around the depiction of the tracking marker.
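The rejection of remote feature detections could look like the following minimal sketch. It assumes that the detector returns one box for the whole marker and one box per feature, and that "remote" means the feature center lies outside the marker box grown by a fixed margin; the function names, the margin and the coordinates are hypothetical.

```python
def inside_boundary(point, marker_box, margin):
    """True if point lies within marker_box grown by margin on every side."""
    px, py = point
    x, y, w, h = marker_box
    return (x - margin <= px <= x + w + margin
            and y - margin <= py <= y + h + margin)


def reject_remote_features(feature_boxes, marker_box, margin):
    """Keep only feature boxes whose center lies near the marker depiction."""
    kept = []
    for fx, fy, fw, fh in feature_boxes:
        center = (fx + fw / 2, fy + fh / 2)
        if inside_boundary(center, marker_box, margin):
            kept.append((fx, fy, fw, fh))
    return kept


# Hypothetical detections as (x, y, width, height) in image-section pixels.
marker = (200, 150, 120, 100)
features = [
    (210, 160, 20, 20),   # inside the marker box
    (325, 240, 20, 20),   # just outside the box, but within the margin
    (500, 400, 20, 20),   # far away, e.g. a reflection or another marker
]

kept = reject_remote_features(features, marker, margin=30)
```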
- a specific embodiment of the inventive approach may gather information from a conventional tracking system so as to distinguish between two or more tracking markers disposed close to each other:
- another tracking system may provide information on the three-dimensional position of these tracking markers and of respective instruments or devices so as to assign the detected features to a specific tracking marker.
- determining image subsection data involves defining the dimensions of at least one of the image subsections to have the image subsection enclose the entire depiction of a respective feature, with the ratio of the entire image subsection to those parts of the image subsection which do not depict the respective feature being above a predefined threshold.
- determining image subsection position data involves increasing the horizontal and/or vertical extent of at least one image subsection within the image received via the camera, and/or wherein determining image subsection position data involves reducing the ratio of the entire image subsection to those parts of the image subsection which do not depict the respective feature.
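The effect of enlarging an image subsection on the ratio described above can be illustrated with a short sketch. It assumes a circular feature depiction of known area and a box that is padded by a fixed fraction on each side and clamped to the image borders; the function names, the padding fraction and the numbers are hypothetical.

```python
import math


def expand_box(box, pad_fraction, image_width, image_height):
    """Grow (x, y, w, h) by pad_fraction of its size per side, clamped to the image."""
    x, y, w, h = box
    pad_x, pad_y = w * pad_fraction, h * pad_fraction
    left = max(0.0, x - pad_x)
    top = max(0.0, y - pad_y)
    right = min(float(image_width), x + w + pad_x)
    bottom = min(float(image_height), y + h + pad_y)
    return (left, top, right - left, bottom - top)


def box_to_background_ratio(box, feature_area):
    """Area of the whole box divided by the area not covered by the feature."""
    _, _, w, h = box
    return (w * h) / (w * h - feature_area)


# Hypothetical sphere depiction of radius 50 px, tightly boxed at 100 x 100 px.
sphere_area = math.pi * 50 ** 2
tight = (600.0, 675.0, 100.0, 100.0)
padded = expand_box(tight, pad_fraction=0.25, image_width=3840, image_height=2160)

ratio_tight = box_to_background_ratio(tight, sphere_area)
ratio_padded = box_to_background_ratio(padded, sphere_area)
```

Padding the box lowers the ratio of total box area to background area, which matches the reduction of that ratio mentioned for the enlarged subsection.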
- determining feature subsection position data involves applying
- CNN convolutional-neural-network
- an edge-detection-algorithm, particularly a Canny-edge-detection-algorithm, which is particularly preceded by a median-blur-algorithm.
- sphere-centering-algorithm refers to a computational technique known in the art, which is used to precisely identify the center and radius of spherical objects within two-dimensional images.
- This method may employ various strategies, for example iterative approaches, the utilization of a Hough-transformation for circle detection, or the incorporation of a convolutional neural network (CNN) in the form of an encoder. Its primary goal is to achieve accurate center and radius determination for spherical objects within two-dimensional images.
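As one possible illustration of determining center and radius, the sketch below fits a circle to edge points of a sphere outline by an algebraic least-squares fit. This fit is a stand-in chosen for brevity and is not one of the strategies named above (iterative approaches, Hough transformation, CNN encoder); the function names and the synthetic edge points are assumptions.

```python
import math


def det3(m):
    """Determinant of a 3 x 3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_circle(points):
    """Algebraic least-squares fit of x^2 + y^2 + D*x + E*y + F = 0.

    Returns (center_x, center_y, radius). Needs at least three points
    that are not collinear.
    """
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    syy = sum(y * y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sz = sum(x * x + y * y for x, y in points)
    sxz = sum(x * (x * x + y * y) for x, y in points)
    syz = sum(y * (x * x + y * y) for x, y in points)
    n = len(points)

    # Normal equations A @ [D, E, F] = b, solved with Cramer's rule.
    a = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    b = [-sxz, -syz, -sz]
    det_a = det3(a)
    solution = []
    for col in range(3):
        replaced = [row[:] for row in a]
        for row in range(3):
            replaced[row][col] = b[row]
        solution.append(det3(replaced) / det_a)
    d, e, f = solution

    cx, cy = -d / 2, -e / 2
    return cx, cy, math.sqrt(cx * cx + cy * cy - f)


# Hypothetical, noise-free edge points of a sphere outline inside a bounding box.
true_cx, true_cy, true_r = 12.5, 7.25, 4.0
edge_points = [
    (true_cx + true_r * math.cos(math.radians(angle)),
     true_cy + true_r * math.sin(math.radians(angle)))
    for angle in range(0, 360, 30)
]

cx, cy, r = fit_circle(edge_points)
```

In practice the edge points would come from an edge detector applied to the bounding box content, and the fitted center would be given with sub-pixel resolution.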
- camera intrinsic parameter data describing at least one intrinsic parameter and/or at least one camera calibration parameter assigned to a respective image or frame.
- the inventive approach may also involve the use of a so-called calibrated camera for obtaining the optical image.
- determining marker position data involves applying a perspective-n-point (PnP)-algorithm.
- PnP-algorithm is an approach well known in the art for extracting a three-dimensional position (including three translational and three rotational degrees of freedom) of an object from a two-dimensional image of that object.
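A PnP solver searches for the pose under which the projected 3D model points best match the 2D feature positions found in the image. The sketch below is not a PnP solver; it only shows the pinhole projection with intrinsic parameters and the reprojection error that such a solver would typically minimize. The marker model, the intrinsics, the poses and all function names are hypothetical. Libraries such as OpenCV offer ready-made solvers (for example `cv2.solvePnP`).

```python
import math


def rotation_about_z(angle_rad):
    """3 x 3 rotation matrix for a rotation about the camera's optical axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def project(model_point, rotation, translation, fx, fy, cx, cy):
    """Pinhole projection of a marker-frame 3D point to pixel coordinates."""
    xc, yc, zc = (
        sum(rotation[i][j] * model_point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )
    return (fx * xc / zc + cx, fy * yc / zc + cy)


def reprojection_rmse(model_points, image_points, rotation, translation, intrinsics):
    """Root-mean-square pixel distance between projected and observed features."""
    total = 0.0
    for model_point, (u_obs, v_obs) in zip(model_points, image_points):
        u, v = project(model_point, rotation, translation, *intrinsics)
        total += (u - u_obs) ** 2 + (v - v_obs) ** 2
    return math.sqrt(total / len(model_points))


# Hypothetical marker model (metres, marker frame) and intrinsics (pixels).
model = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0), (0.1, 0.1, 0.02)]
intrinsics = (1000.0, 1000.0, 1920.0, 1080.0)  # fx, fy, cx, cy

true_rotation = rotation_about_z(math.radians(10))
true_translation = (0.05, -0.02, 0.8)
observed = [project(p, true_rotation, true_translation, *intrinsics) for p in model]

# The correct pose reproduces the observations; a wrong pose does not.
error_true = reprojection_rmse(
    model, observed, true_rotation, true_translation, intrinsics)
error_wrong = reprojection_rmse(
    model, observed, rotation_about_z(0.0), (0.0, 0.0, 0.8), intrinsics)
```

The pose with the smallest reprojection error corresponds to the marker position data, i.e. the 6D-pose of the marker with respect to the camera.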
- the invention is directed to a computer program comprising instructions which, when the program is executed by at least one computer, cause the at least one computer to carry out the method according to the first aspect.
- the invention may alternatively or additionally relate to a (physical, for example electrical, for example technically generated) signal wave, for example a digital signal wave, such as an electromagnetic carrier wave carrying information which represents the program, for example the aforementioned program, which for example comprises code means which are adapted to perform any or all of the steps of the method according to the first aspect.
- the signal wave is in one example a data carrier signal carrying the aforementioned computer program.
- a computer program stored on a disc is a data file, and when the file is read out and transmitted it becomes a data stream for example in the form of a (physical, for example electrical, for example technically generated) signal.
- the signal can be implemented as the signal wave, for example as the electromagnetic carrier wave which is described herein.
- the signal, for example the signal wave, is constituted to be transmitted via a computer network, for example LAN, WLAN, WAN, mobile network, for example the internet.
- the signal, for example the signal wave, is constituted to be transmitted by optic or acoustic data transmission.
- the invention according to the second aspect therefore may alternatively or additionally relate to a data stream representative of the aforementioned program, i.e. comprising the program.
- the invention is directed to a computer-readable storage medium on which the program according to the second aspect is stored.
- the program storage medium is for example non-transitory.
- the invention is directed to at least one computer (for example, a computer), comprising at least one processor (for example, a processor), wherein the program according to the second aspect is executed by the processor, or wherein the at least one computer comprises the computer-readable storage medium according to the third aspect.
- the invention according to the fifth aspect is directed to a, for example non-transitory, computer-readable program storage medium storing a program for causing the computer according to the fourth aspect to execute the data processing steps of the method according to the first aspect.
- the disclosed method is not a method for treatment of the human or animal body by surgery or therapy.
- the invention does not involve or in particular comprise or encompass an invasive step which would represent a substantial physical interference with the body requiring professional medical expertise to be carried out and entailing a substantial health risk even when carried out with the required professional care and expertise.
- the invention does not involve or in particular comprise or encompass any surgical or therapeutic activity.
- the invention is instead directed as applicable to tracking of objects over time. For this reason alone, no surgical or therapeutic activity and in particular no surgical or therapeutic step is necessitated or implied by carrying out the invention.
- the present invention also relates to the use of the device/system or any embodiment thereof for tracking objects over time.
- the method in accordance with the invention is for example a computer-implemented method.
- all the steps or merely some of the steps (i.e. less than the total number of steps) of the method in accordance with the invention can be executed by a computer (for example, at least one computer).
- An embodiment of the computer implemented method is a use of the computer for performing a data processing method.
- An embodiment of the computer implemented method is a method concerning the operation of the computer such that the computer is operated to perform one, more or all steps of the method.
- the computer for example comprises at least one processor and for example at least one memory in order to (technically) process the data, for example electronically and/or optically.
- the processor being for example made of a substance or composition which is a semiconductor, for example at least partly n- and/or p-doped semiconductor, for example at least one of II-, III-, IV-, V-, VI-semiconductor material, for example (doped) silicon and/or gallium arsenide.
- the calculating or determining steps described are for example performed by a computer. Determining steps or calculating steps are for example steps of determining data within the framework of the technical method, for example within the framework of a program.
- a computer is for example any kind of data processing device, for example electronic data processing device.
- a computer can be a device which is generally thought of as such, for example desktop PCs, notebooks, netbooks, etc., but can also be any programmable apparatus, such as for example a mobile phone or an embedded processor.
- a computer can for example comprise a system (network) of "sub-computers", wherein each sub-computer represents a computer in its own right.
- the term "computer" includes a cloud computer, for example a cloud server.
- the term computer includes a server resource.
- the term "cloud computer" includes a cloud computer system which for example comprises a system of at least one cloud computer and for example a plurality of operatively interconnected cloud computers such as a server farm.
- Such a cloud computer is preferably connected to a wide area network such as the world wide web (WWW) and located in a so-called cloud of computers which are all connected to the world wide web.
- Such an infrastructure is used for "cloud computing", which describes computation, software, data access and storage services which do not require the end user to know the physical location and/or configuration of the computer delivering a specific service.
- the term "cloud" is used in this respect as a metaphor for the Internet (world wide web).
- the cloud provides computing infrastructure as a service (IaaS).
- the cloud computer can function as a virtual host for an operating system and/or data processing application which is used to execute the method of the invention.
- the cloud computer is for example an elastic compute cloud (EC2) as provided by Amazon Web Services™.
- a computer for example comprises interfaces in order to receive or output data and/or perform an analogue-to-digital conversion.
- the data are for example data which represent physical properties and/or which are generated from technical signals.
- the technical signals are for example generated by means of (technical) detection devices (such as for example devices for detecting marker devices) and/or (technical) analytical devices (such as for example devices for performing (medical) imaging methods), wherein the technical signals are for example electrical or optical signals.
- the technical signals for example represent the data received or outputted by the computer.
- the computer is preferably operatively coupled to a display device which allows information outputted by the computer to be displayed, for example to a user.
- an example of a display device is a virtual reality device or an augmented reality device (also referred to as virtual reality glasses or augmented reality glasses) which can be used as "goggles" for navigating.
- a specific example of such augmented reality glasses is Google Glass (a trademark of Google, Inc.).
- An augmented reality device or a virtual reality device can be used both to input information into the computer by user interaction and to display information outputted by the computer.
- Another example of a display device would be a standard computer monitor comprising for example a liquid crystal display operatively coupled to the computer for receiving display control data from the computer for generating signals used to display image information content on the display device.
- a specific embodiment of such a computer monitor is a digital lightbox.
- An example of such a digital lightbox is Buzz®, a product of Brainlab AG.
- the monitor may also be the monitor of a portable, for example handheld, device such as a smart phone or personal digital assistant or digital media player.
- the invention also relates to a computer program comprising instructions which, when the program is executed by a computer, cause the computer to carry out the method or methods, for example, the steps of the method or methods, described herein and/or to a computer-readable storage medium (for example, a non-transitory computer-readable storage medium) on which the program is stored and/or to a computer comprising said program storage medium and/or to a (physical, for example electrical, for example technically generated) signal wave, for example a digital signal wave, such as an electromagnetic carrier wave carrying information which represents the program, for example the aforementioned program, which for example comprises code means which are adapted to perform any or all of the method steps described herein.
- the signal wave is in one example a data carrier signal carrying the aforementioned computer program.
- the invention also relates to a computer comprising at least one processor and/or the aforementioned computer-readable storage medium and for example a memory, wherein the program is executed by the processor.
- computer program elements can be embodied by hardware and/or software (this includes firmware, resident software, micro-code, etc.).
- computer program elements can take the form of a computer program product which can be embodied by a computer-usable, for example computer-readable data storage medium comprising computer-usable, for example computer-readable program instructions, "code" or a "computer program" embodied in said data storage medium for use on or in connection with the instruction-executing system.
- Such a system can be a computer; a computer can be a data processing device comprising means for executing the computer program elements and/or the program in accordance with the invention, for example a data processing device comprising a digital processor (central processing unit or CPU) which executes the computer program elements, and optionally a volatile memory (for example a random access memory or RAM) for storing data used for and/or produced by executing the computer program elements.
- a computer-usable, for example computer-readable data storage medium can be any data storage medium which can include, store, communicate, propagate or transport the program for use on or in connection with the instruction-executing system, apparatus or device.
- the computer-usable, for example computer-readable data storage medium can for example be, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device or a medium of propagation such as for example the Internet.
- the computer-usable or computer-readable data storage medium could even for example be paper or another suitable medium onto which the program is printed, since the program could be electronically captured, for example by optically scanning the paper or other suitable medium, and then compiled, interpreted or otherwise processed in a suitable manner.
- the data storage medium is preferably a non-volatile data storage medium.
- the computer program product and any software and/or hardware described here form the various means for performing the functions of the invention in the example embodiments.
- the computer and/or data processing device can for example include a guidance information device which includes means for outputting guidance information.
- the guidance information can be outputted, for example to a user, visually by a visual indicating means (for example, a monitor and/or a lamp) and/or acoustically by an acoustic indicating means (for example, a loudspeaker and/or a digital speech output device) and/or tactilely by a tactile indicating means (for example, a vibrating element or a vibration element incorporated into an instrument).
- a computer is a technical computer which for example comprises technical, for example tangible components, for example mechanical and/or electronic components. Any device mentioned as such in this document is a technical and for example tangible device.
Acquiring data
- acquiring data for example encompasses (within the framework of a computer implemented method) the scenario in which the data are determined by the computer implemented method or program.
- Determining data for example encompasses measuring physical quantities and transforming the measured values into data, for example digital data, and/or computing (and e.g. outputting) the data by means of a computer and for example within the framework of the method in accordance with the invention.
- a step of “determining” as described herein comprises or consists of issuing a command to perform the determination described herein.
- the step comprises or consists of issuing a command to cause a computer, for example a remote computer, for example a remote server, for example in the cloud, to perform the determination.
- The database or databases used for implementing the disclosed method can be located on a network data storage device or a network server (for example, a cloud data storage device or a cloud server) or a local data storage device (such as a mass storage device operably connected to at least one computer executing the disclosed method).
- The data can be made "ready for use" by performing an additional step before the acquiring step.
- the data are generated in order to be acquired.
- the data are for example detected or captured (for example by an analytical device).
- the data are inputted in accordance with the additional step, for instance via interfaces.
- the data generated can for example be inputted (for instance into the computer).
- the data can also be provided by performing the additional step of storing the data in a data storage medium (such as for example a ROM, RAM, CD and/or hard drive), such that they are ready for use within the framework of the method or program in accordance with the invention.
- The step of "acquiring data" can therefore also involve commanding a device to obtain and/or provide the data to be acquired.
- the acquiring step does not involve an invasive step which would represent a substantial physical interference with the body, requiring professional medical expertise to be carried out and entailing a substantial health risk even when carried out with the required professional care and expertise.
- the step of acquiring data does not involve a surgical step and in particular does not involve a step of treating a human or animal body using surgery or therapy.
- The data are denoted (i.e. referred to) as "XY data" and the like and are defined in terms of the information which they describe, which is then preferably referred to as "XY information" and the like.
- the n-dimensional image of a body is registered when the spatial location of each point of an actual object within a space, for example a body part in an operating theatre, is assigned an image data point of an image (CT, MR, etc.) stored in a navigation system.
- Image registration is the process of transforming different sets of data into one coordinate system.
- the data can be multiple photographs and/or data from different sensors, different times or different viewpoints. It is used in computer vision, medical imaging and in compiling and analysing images and data from satellites. Registration is necessary in order to be able to compare or integrate the data obtained from these different measurements.
- a marker detection device for example, a camera or an ultrasound receiver or analytical devices such as CT or MRI devices
- the detection device is for example part of a navigation system.
- the markers can be active markers.
- An active marker can for example emit electromagnetic radiation and/or waves which can be in the infrared, visible and/or ultraviolet spectral range.
- a marker can also however be passive, i.e. can for example reflect electromagnetic radiation in the infrared, visible and/or ultraviolet spectral range or can block X-ray radiation.
- the position can for example be represented by the position of an X-ray beam which passes through the centre of said multiplicity or by the position of a geometric object (such as a truncated cone) which represents the multiplicity (manifold) of X-ray beams.
- Information concerning the above-mentioned interaction is preferably known in three dimensions, for example from a three-dimensional CT, and describes the interaction in a spatially resolved way for points and/or regions of the analysis object, for example for all of the points and/or regions of the analysis object.
- Knowledge of the imaging geometry for example allows the location of a source of the radiation (for example, an X-ray source) to be calculated relative to an image plane (for example, the plane of an X-ray detector).
- Fig. 4 is a schematic illustration of the system according to the first aspect.
- Fig. 5 is an illustration of the system according to the first aspect.
- Fig. 1 illustrates the basic steps of the method according to the first aspect, in which step S11 encompasses acquiring image data, step S12 encompasses determining image section data, step S13 encompasses determining feature detection data, step S14 encompasses determining image subsection data, step S15 encompasses determining image subsection position data, step S16 encompasses determining feature subsection position data, step S17 encompasses determining image feature position data, step S18 encompasses acquiring relative feature position data, and step S19 encompasses determining marker position data.
- In step S11, a high-resolution RGB image is acquired via an optical camera 4 which is for example integrated in a head-mounted augmented reality device 7 (cf. Figure 5). Since the pixel count of this image is not suitable for further processing, as will be described further below, the image resolution needs to be reduced without losing image content, i.e. the depiction of the tracking marker 8 having four features 9 (cf. Figure 5). Therefore, in step S12 the camera image is first “resized” to a pixel count of 1280 x 960, and an image section having a pixel count of 640 x 640 is then “cropped” from the resized image.
- In step S13, YOLOv5 is applied so as to detect depictions of the features 9 of the tracking marker 8 within the image section. Based on the machine-learning model of YOLOv5, the position of the features 9 within the image section is detected, and in step S14 a bounding box is defined for each feature 9. The bounding boxes fit snugly around the respective features 9.
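The resize-and-crop of step S12 can be sketched as follows. This is a minimal illustration only: the 3840 x 2880 frame size, the nearest-neighbour resampling, the centred crop and the function names `resize_nearest` and `center_crop` are assumptions, since the text states only the 1280 x 960 and 640 x 640 pixel counts.

```python
import numpy as np

def resize_nearest(image, out_w, out_h):
    """Downscale an H x W x C image by nearest-neighbour sampling."""
    in_h, in_w = image.shape[:2]
    # Source row/column index for every output row/column.
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return image[rows[:, None], cols[None, :]]

def center_crop(image, size):
    """Cut a size x size section from the centre; also return its top-left offset."""
    h, w = image.shape[:2]
    x0 = (w - size) // 2
    y0 = (h - size) // 2
    return image[y0:y0 + size, x0:x0 + size], (x0, y0)

# Hypothetical 4:3 camera frame (assumed size, not taken from the text).
frame = np.zeros((2880, 3840, 3), dtype=np.uint8)
resized = resize_nearest(frame, 1280, 960)   # step S12: resize
section, offset = center_crop(resized, 640)  # step S12: crop
# Scale factor needed later to map results back to the original image.
scale = frame.shape[1] / resized.shape[1]
```

Keeping the crop offset and the scale factor is what later allows positions found in the small image section to be expressed in the original high-resolution image.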
- each of these bounding boxes is subsequently calculated within the original high-resolution image retrieved from the camera 7.
- the bounding boxes calculated for the original image fully enclose a respective depiction of a feature 9
- the bounding boxes’ boundaries are enlarged by a predefined amount so as to ensure that the bounding boxes contain the depiction of the features 9 even when inaccuracies in image processing would mean that the actual position of the feature depictions deviates from the expected positions of the feature depictions.
- these depictions may, at least to some extent, lie beyond the boundaries of the bounding boxes, such that further processing based on the image content of the affected bounding boxes would be impaired.
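Mapping a bounding box from the cropped image section back to the original image, and enlarging it by a predefined amount, might look like the sketch below. The function name `box_to_original`, the `(x1, y1, x2, y2)` box layout, the crop offset, the scale factors and the 8-pixel margin are illustrative assumptions and are not values given in the text.

```python
def box_to_original(box, crop_offset, scale_x, scale_y, margin, image_w, image_h):
    """Map an (x1, y1, x2, y2) box from the cropped section to the original image."""
    x1, y1, x2, y2 = box
    ox, oy = crop_offset
    # Undo the crop (add the offset), then undo the resize (multiply by the scale).
    x1, x2 = (x1 + ox) * scale_x, (x2 + ox) * scale_x
    y1, y2 = (y1 + oy) * scale_y, (y2 + oy) * scale_y
    # Enlarge by the safety margin and clamp to the image borders.
    x1 = max(0.0, x1 - margin)
    y1 = max(0.0, y1 - margin)
    x2 = min(image_w, x2 + margin)
    y2 = min(image_h, y2 + margin)
    return (x1, y1, x2, y2)

# Assumed values: centred 640 x 640 crop of a 1280 x 960 image, scale factor 3.
padded = box_to_original((100, 200, 120, 220), (320, 160), 3.0, 3.0,
                         margin=8, image_w=3840, image_h=2880)
```

The margin trades a slightly larger subsection for robustness against small errors in the detected box position.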
- In a subsequent step S16, the position of the bounding boxes is calculated within the image plane of the original image. Further, in subsequent step S17, the position of the feature depictions (i.e. the center of the depiction of the spherical features 9 in the shown example) is determined within the respective bounding boxes so as to eventually determine the position of each of the feature depictions within the original image.
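The combination of a position inside a bounding box with the position of that box in the original image can be illustrated as follows. The text does not specify how the center of a feature depiction is found, so the intensity-weighted centroid used here is a stand-in assumption, as are the function name `feature_center_in_image` and the sample values.

```python
import numpy as np

def feature_center_in_image(patch, box_origin):
    """Return the (x, y) position in the full image of the feature centre in patch.

    patch      : 2D array of non-negative intensities cut out along the bounding box
    box_origin : (x, y) of the patch's top-left pixel in the original image
    """
    total = patch.sum()
    if total == 0:
        raise ValueError("patch contains no signal")
    ys, xs = np.mgrid[0:patch.shape[0], 0:patch.shape[1]]
    # Intensity-weighted centroid inside the patch (sub-pixel precision).
    cx = (xs * patch).sum() / total
    cy = (ys * patch).sum() / total
    # Shift by the box position to obtain coordinates in the original image.
    return box_origin[0] + cx, box_origin[1] + cy

patch = np.zeros((9, 9))
patch[3:6, 4:7] = 1.0  # bright blob covering rows 3..5 and columns 4..6
center = feature_center_in_image(patch, (1252, 1072))
```

For spherical features, a dedicated circle or ellipse fit would likely be more accurate than a plain centroid; the offset arithmetic stays the same.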
- the spatial position (i.e. the spatial location and the spatial orientation) of the tracking marker 8 is calculated from the known position of the feature depictions in the original image acquired via the camera 4 and the expected spatial position of the features 9 with respect to each other, which is known from a model of a real-world tracking marker 8 stored on the storage device 6 (cf. Figure 4).
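Pose estimation from a single camera image typically searches for the rotation and translation that make the projected model features coincide with the detected feature positions. The sketch below shows only the forward model and the reprojection error that such a search would minimise; it is not the solver itself. The pinhole camera model without lens distortion, the intrinsic parameters, the marker geometry and all function names are assumptions made for illustration.

```python
import numpy as np

def project(points_marker, R, t, fx, fy, cx, cy):
    """Project 3D feature positions (marker frame) into pixel coordinates."""
    pc = points_marker @ R.T + t  # marker frame -> camera frame
    u = fx * pc[:, 0] / pc[:, 2] + cx
    v = fy * pc[:, 1] / pc[:, 2] + cy
    return np.stack([u, v], axis=1)

def reprojection_rms(observed, points_marker, R, t, fx, fy, cx, cy):
    """RMS pixel distance between observed and projected feature positions."""
    diff = project(points_marker, R, t, fx, fy, cx, cy) - observed
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))

# Hypothetical marker model with four features (metres, marker frame).
model = np.array([[0.00, 0.00, 0.00],
                  [0.05, 0.00, 0.00],
                  [0.00, 0.05, 0.00],
                  [0.03, 0.03, 0.02]])
intr = dict(fx=1000.0, fy=1000.0, cx=640.0, cy=480.0)  # assumed intrinsics

R_true = np.eye(3)
t_true = np.array([0.0, 0.0, 0.5])
observed = project(model, R_true, t_true, **intr)

err_true = reprojection_rms(observed, model, R_true, t_true, **intr)
err_off = reprojection_rms(observed, model, R_true, t_true + [0.01, 0.0, 0.0], **intr)
```

A pose candidate that is off by 1 cm already produces a clearly non-zero pixel error here, which is the signal an optimiser or a perspective-n-point solver exploits.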
- Fig. 3 illustrates an embodiment of the present invention that includes all essential features of the invention.
- the entire data processing which is part of the method according to the second aspect is performed by a computer 2.
- Reference sign 1 denotes the input of data acquired by the method according to the second aspect into the computer 2 and reference sign 3 denotes the output of data determined by the method according to the second aspect.
- Fig. 4 is a schematic illustration of the tracking system 4 according to the first aspect.
- The system is in its entirety identified by reference sign 4 and comprises a computer 5, an electronic data storage device (such as a hard disc) 6 for storing at least the relative feature position data and an RGB camera 7.
- the components of the tracking system 4 have the functionalities and properties explained above with regard to the first aspect of this disclosure.
- Fig. 5 shows a specific example of the tracking system according to the first aspect.
- a conventional optical tracking system 10 having a first coordinate system 11 is spatially disposed such that the field of view of its stereoscopic camera array 12 covers the spatial volume within which the tracking marker 8 along with its features 9 is present.
- a head-mounted augmented reality device 7 having a forward facing camera 4 and having a second coordinate system 12 is also provided.
- The inventive approach not only allows for aligning the first coordinate system 11 and the second coordinate system 12 based on detecting and tracking the spatial position of the tracking marker 8 with the conventional tracking system 10 and the head-mounted augmented reality device 7, but even allows for replacing the conventional tracking system 10 with the head-mounted augmented reality device 7 for tracking the spatial position of the tracking marker 8 and other trackable devices 13.
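Aligning two coordinate systems via a marker that both systems track can be sketched with homogeneous 4 x 4 transforms. The helper names and the numeric poses below are hypothetical; `T_a_b` denotes the transform that maps coordinates expressed in system b into system a.

```python
import numpy as np

def make_pose(R, t):
    """Build a 4x4 homogeneous transform from a rotation matrix and a translation."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def invert_pose(T):
    """Invert a rigid transform without a general matrix inverse."""
    R, t = T[:3, :3], T[:3, 3]
    Ti = np.eye(4)
    Ti[:3, :3] = R.T
    Ti[:3, 3] = -R.T @ t
    return Ti

def align(T_a_marker, T_b_marker):
    """Return T_a_b from the pose of the same marker seen in systems a and b."""
    return T_a_marker @ invert_pose(T_b_marker)

# Assumed marker poses in the first (11) and second (12) coordinate systems.
Rz90 = np.array([[0.0, -1.0, 0.0],
                 [1.0,  0.0, 0.0],
                 [0.0,  0.0, 1.0]])
T_11_marker = make_pose(np.eye(3), [1.0, 0.0, 2.0])
T_12_marker = make_pose(Rz90, [0.0, 0.5, 0.0])
T_11_12 = align(T_11_marker, T_12_marker)

# A point on the marker must land at the same place whichever route is taken.
p_marker = np.array([0.1, 0.0, 0.0, 1.0])
p_via_12 = T_11_12 @ (T_12_marker @ p_marker)
p_direct = T_11_marker @ p_marker
```

Once `T_11_12` is known, anything tracked by the head-mounted device in system 12 can be expressed in system 11, and vice versa using the inverse.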
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Software Systems (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- Artificial Intelligence (AREA)
- Medical Informatics (AREA)
- Health & Medical Sciences (AREA)
- Computer Graphics (AREA)
- Computer Hardware Design (AREA)
- General Engineering & Computer Science (AREA)
- Image Analysis (AREA)
Abstract
The invention relates to acquiring a two-dimensional image comprising a depiction of an object, in particular a tracking marker, and using artificial intelligence, in particular a machine-learning model, to extract information from the two-dimensional image so as to eventually determine the spatial position of the object relative to the camera.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/EP2023/076869 WO2025067654A1 (fr) | 2023-09-28 | 2023-09-28 | Suivi monoscopique de marqueurs |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/EP2023/076869 WO2025067654A1 (fr) | 2023-09-28 | 2023-09-28 | Suivi monoscopique de marqueurs |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025067654A1 (fr) | 2025-04-03 |
Family
ID=88287454
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/EP2023/076869 Pending WO2025067654A1 (fr) | 2023-09-28 | 2023-09-28 | Suivi monoscopique de marqueurs |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2025067654A1 (fr) |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20200229876A1 (en) * | 2010-03-03 | 2020-07-23 | Smith & Nephew, Inc. | Method for enabling medical navigation with minimised invasiveness |
| WO2022144116A1 (fr) * | 2020-12-31 | 2022-07-07 | Imec Vzw | Système de réalité augmentée, visiocasque à réalité augmentée et procédé de réalité augmentée et programme informatique |
- 2023
  - 2023-09-28 WO PCT/EP2023/076869 patent/WO2025067654A1/fr active Pending
Non-Patent Citations (5)
| Title |
|---|
| ROGER Y. TSAI: "A Versatile Camera Calibration Technique for High-Accuracy 3D Machine Vision Metrology Using Off-the-Shelf TV Cameras and Lenses", IEEE JOURNAL OF ROBOTICS AND AUTOMATION, vol. RA-3, no. 4, August 1987 (1987-08-01), pages 323 - 344 |
| ROGER Y. TSAI: "An Efficient and Accurate Camera Calibration Technique for 3D Machine Vision", PROCEEDINGS OF THE IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, 1986, pages 364 - 374, XP001004843 |
| SEBASTIAN VOGT ET AL: "Single Camera Tracking of Marker Clusters", PROCEEDINGS / INTERNATIONAL SYMPOSIUM ON MIXED AND AUGMENTED REALITY : ISMAR 2002 ; SEPTEMBER 30 - OCTOBER 1, 2002, DARMSTADT, GERMANY, IEEE COMPUTER SOCIETY, LOS ALAMITOS, CALIF. [U.A.], 30 September 2002 (2002-09-30), pages 127, XP058274982, ISBN: 978-0-7695-1781-0 * |
| TOTH, HAJDER: "A Minimal Solution for Image-Based Sphere Estimation", INTERNATIONAL JOURNAL OF COMPUTER VISION, vol. 131, 2023, pages 1428 - 1447 |
| YANIV Z., JOSKOWICZ L., SIMKIN A., GARZA-JINICH M., MILGROM C.: "Computer-Assisted Intervention - MICCAI'98. MICCAI 1998. Lecture Notes in Computer Science", vol. 1496, 1998, SPRINGER, article "Fluoroscopic image processing for computer-aided orthopaedic surgery" |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US20240245463A1 (en) | | Visualization of medical data depending on viewing-characteristics |
| US11759261B2 (en) | | Augmented reality pre-registration |
| US10987190B2 (en) | | Generation of augmented reality image of a medical device |
| EP3593226B1 (fr) | | Navigation à réalité augmentée médicale |
| EP3664737B1 (fr) | | Enregistrement et suivi de patient basé sur vidéo |
| US12263031B2 (en) | | Determining a configuration of a medical x-ray imaging system for detecting a marker device |
| US12406396B2 (en) | | Microscope camera calibration |
| WO2025067654A1 (fr) | | Suivi monoscopique de marqueurs |
| US12112437B2 (en) | | Positioning medical views in augmented reality |
| EP4196035B1 (fr) | | Détermination d'une zone d'évitement pour un dispositif de référence |
| IL310198A (en) | | Conjunction of 2d and 3d visualisations in augmented reality |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 23783730; Country of ref document: EP; Kind code of ref document: A1 |