WO2024260371A1 - A device, process and system for diagnosing and determining the spinal alignment of a person - Google Patents
- Publication number
- WO2024260371A1 (PCT/CN2024/100127)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- rgbd
- rci
- spine
- module
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/103—Measuring devices for testing the shape, pattern, colour, size or movement of the body or parts thereof, for diagnostic purposes
- A61B5/107—Measuring physical dimensions, e.g. size of the entire body or parts thereof
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/0033—Features or image-related aspects of imaging apparatus, e.g. for MRI, optical tomography or impedance tomography apparatus; Arrangements of imaging apparatus in a room
- A61B5/0035—Features or image-related aspects of imaging apparatus, e.g. for MRI, optical tomography or impedance tomography apparatus; Arrangements of imaging apparatus in a room adapted for acquisition of images from more than one imaging mode, e.g. combining MRI and optical tomography
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/0059—Measuring for diagnostic purposes; Identification of persons using light, e.g. diagnosis by transillumination, diascopy, fluorescence
- A61B5/0077—Devices for viewing the surface of the body, e.g. camera, magnifying lens
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/45—For evaluating or diagnosing the musculoskeletal system or teeth
- A61B5/4538—Evaluating a particular part of the musculoskeletal system or a particular medical condition
- A61B5/4566—Evaluating the spine
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/48—Other medical applications
- A61B5/4887—Locating particular structures in or on the body
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/30—Determination of transform parameters for the alignment of images, i.e. image registration
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B2576/00—Medical imaging apparatus involving image processing or analysis
- A61B2576/02—Medical imaging apparatus involving image processing or analysis specially adapted for a particular organ or body part
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30008—Bone
- G06T2207/30012—Spine; Backbone
Definitions
- the present invention relates to a device, process and system for diagnosing and determining the spinal alignment of a person. More particularly, the present invention provides a device, process and system for diagnosing and determining the spinal alignment of a subject without the use of radiation.
- spine malalignment includes two patient cohorts:
- Scoliosis is a three-dimensional (3D) deformity of the spine defined as a Cobb angle 1 (CA: measured by an angle formed by the upper endplate of the uppermost tilted vertebra and the lower endplate of the lowest tilted vertebra of the structural curve) greater than 10 degrees on standing plain radiographs.
- CA Cobb angle 1
- AIS adolescent idiopathic scoliosis
- Untreated cases may progress rapidly during the pubertal growth spurt, causing body disfigurement, cardiopulmonary compromise, and back pain 4, 5 .
- Radiographic examination is the reference standard for quantifying AIS severities and curve types 9 .
- AIS follow-ups and progression monitoring require repetitive radiographic examinations. 10
- Children are especially sensitive to radiation due to the higher metabolic activity of their cells 11 .
- accurate and radiation-free approaches 12 for spine alignment analysis are desirable.
- Non-radiation techniques for scoliosis assessment have been studied for years. Previous methods include 3-dimensional ultrasound 13 , digital inclinometer 14 , rasterstereography 15 and electrogoniometer 16 .
- the present invention is directed to providing a radiograph-comparable image (RCI) of a region of interest (ROI) of the spine of a patient or subject.
- RCI radiograph-comparable image
- ROI region of interest
- the RCI according to the present invention is produced by a light-based RCI synthesis system from a Red Green Blue-Depth (RGBD) image of the naked spine of a patient.
- RGBD Red Green Blue-Depth
- this obviates the necessity to expose a patient to radiation, which is particularly important for people requiring repetitive imaging, such as in scoliosis, whereby children may need to be imaged regularly, or at least assessed regularly, so as to determine the amount of progression as well as the effectiveness of the treatment.
- the present invention provides a computerized system for providing a radiograph-comparable image (RCI) of a region of interest (ROI) spinal region of a subject, said system comprising:
- a Red Green Blue-Depth (RGBD) standardization module for receiving an RGBD input image of the ROI of a subject and for providing a standardised RGBD image of said ROI;
- a back landmark detection module for detecting anatomic landmarks from the standardised RGBD image of said ROI;
- a landmark guided RCI synthesis module for providing a synthesized RCI of the ROI of the spine of the subject from the detected anatomical landmarks from the back landmark detection module and from the standardised RGBD image;
- Red Green Blue-Depth (RGBD) standardization module is implemented with rule-based and adaptive algorithms to standardize the images
- back landmark detection module and the landmark guided RCI synthesis module utilise computerised deep learning techniques
- RGBD input image of a region of interest of the spine of a subject is acquired by an RGBD image acquisition device for providing an RGBD input image.
- the system may further include a quantitative alignment analysis module, for analysing the alignment of the spine of the subject from the RCI, and wherein the quantitative alignment analysis module utilises computerised deep learning techniques.
- the alignment analysis module may provide a predictive analysis of the Cobb angle of the subject.
- the alignment analysis module may provide the severity of any spinal deformity of the subject, and the curve classification of the deformity of the spine of the subject.
- the system may further include a radiographic standardisation module which is implemented with rule-based and adaptive algorithms, wherein the radiographic standardisation module receives a radiographic image of an ROI of a spine and provides a standardised X-ray, and wherein the RCI synthesis module further provides the synthesized RCI of the ROI of the spine of the subject from the standardised X-ray.
- the X-ray image may be an X-ray image of the ROI of the subject
- the input image is a two-dimensional (2D) image or a three-dimensional (3D) image.
- the system utilises generative Artificial Intelligence (AI) to generate spine alignment using the RGBD input image.
- the spine alignment may be three-dimensional spine alignment.
- the present invention provides a process operable on a computerized system for providing a radiograph-comparable image (RCI) of the spinal region of a subject, said system comprising a Red Green Blue-Depth (RGBD) standardization module, a back landmark detection module and a landmark guided RCI synthesis module; wherein the Red Green Blue-Depth (RGBD) standardization module is implemented with rule-based and adaptive algorithms to standardize the images; and wherein the back landmark detection module and the landmark guided RCI synthesis module utilise computerised deep learning techniques, said process including the steps of:
- the system may further include a quantitative alignment analysis module, for analysing the alignment of the spine of the subject from the RCI, and wherein the quantitative alignment analysis module utilises computerised deep learning techniques.
- the alignment analysis module may provide a predictive analysis of the Cobb angle of the subject.
- the alignment analysis module may provide the severity of any spinal deformity of the subject, and the curve classification of the deformity of the spine of the subject.
- the system may further include a radiographic standardisation module which is implemented with rule-based and adaptive algorithms, wherein the radiographic standardisation module receives a radiographic image of an ROI of a spine and provides a standardised X-ray, and wherein the RCI synthesis module further provides the synthesized RCI of the ROI of the spine of the subject from the standardised X-ray.
- the X-ray image may be an X-ray image of the ROI of the subject
- the input image is a two-dimensional (2D) image or a three-dimensional (3D) image.
- the system utilises generative Artificial Intelligence (AI) to generate spine alignment using the RGBD input image.
- the spine alignment may be three-dimensional spine alignment.
- Figure 1a shows a schematic representation of a first embodiment of a system according to the present invention
- Figure 1b shows a schematic representation of a second embodiment of a system according to the present invention
- Figure 1c shows a schematic representation of a first embodiment of a process according to the present invention
- Figure 1d shows a schematic representation of a deep learning approach to synthesize RCI from nude back images containing RGB and depth information (RGBD) , in accordance with the present invention
- Figure 1e shows a schematic representation of an application and evaluation of the AI system according to the present invention
- Figure 1f depicts the proportion of different severity levels and genders in all participants, in evaluation of the present invention
- Figure 1g shows an example of aligned RGB, depth and corresponding radiograph, according to the present invention
- Figure 2a shows statistical analysis and visual results of the landmarks detection on the bareback, for evaluation of the present invention
- Figure 2b shows the statistics and error distribution for each landmark using violin plot, box plot, and scatter plot in Figure 2a
- Figure 3a and Figure 3b show normal-mild severity of curvature of the back of a subject
- Figure 3c and Figure 3d show moderate severity of curvature of the back of a subject
- Figure 3e and Figure 3f show severe severity of curvature of the back of a subject
- Figure 4a shows a regression plot to quantitatively assess the validity of the CA parameters estimated from the synthesized RCI, according to the present invention
- Figure 4b shows a Bland-Altman plot for performance on Cobb angle prediction for the present invention
- Figures 4c - 4f show results of classification tasks, namely severity grading (3 types: normal-moderate, moderate, and severe) , curve detection, and the curve type prediction (2 types: T and TL/L) ;
- Figure 5 shows visual comparison of practical CA measuring in clinics using the original radiographs (left) and the synthesized RCI counterparts (right) , according to the present invention
- Figure 6 shows a sketch map of experimental settings for RGBD image collection, according to the present invention.
- Figure 7 shows the real experimental setting for RGBD image collection, according to the present invention.
- Figure 8a shows a representation of image acquisition for image standardization according to the present invention
- Figure 8c shows depth image processing
- Figure 9 shows the overview of the light-based RCI synthesis system, according to the present invention.
- Figure 10 shows the detailed architecture structure of HRNet backbone in the back landmark detection module, according to the present invention.
- Figure 11 shows architecture structure of ResNet_9blocks and 5-layer PatchGAN models in the RCI synthesis module, according to the present invention
- Figures 12a - 12f show the impacts of the selection of probability threshold on retrieval rate and distance metrics (MED and MMD) for 6 anatomical landmarks on the back image, according to the present invention
- Figures 13a-13f show landmark detection learning curves for each proportion of training data, according to the present invention.
- Figures 14a -14f show RCI synthesis learning curves for each proportion of training data, according to the present invention.
- Figures 15a and 15b show loss value on the prospective testing dataset in terms of the subset proportion.
- Referring to Figure 1a, Figure 1b and Figure 1c, there is shown a schematic representation of an exemplary embodiment of a system and a process according to the present invention.
- In Figure 1a, an embodiment of a system 100a of the present invention is shown, for providing a radiograph-comparable image (RCI) of a region of interest (ROI) spinal region of a subject.
- the system 100a includes:
- a Red Green Blue-Depth (RGBD) standardization module 110a for receiving an RGBD input image 102a of the ROI of a subject and for providing a standardised RGBD image 104a of said ROI;
- a back landmark detection module 120a for detecting anatomic landmarks from the standardised RGBD image 104b of said ROI.
- a landmark guided RCI synthesis module 130a for providing a synthesized RCI 132a of the ROI of the spine of the subject from the detected anatomical landmarks from the back landmark detection module and from the standardised RGBD image 104a;
- the Red Green Blue-Depth (RGBD) standardization module 110a is implemented with rule-based and adaptive algorithms to standardize the images.
- the back landmark detection module 120a and the landmark guided RCI synthesis module 130a utilise computerised deep learning techniques.
- An RGBD input image of a region of interest (ROI) of the spine of a subject is acquired by an RGBD image acquisition device for providing an RGBD input image 102a.
- ROI region of interest
- the system 100a also includes a quantitative alignment analysis module 140a in communication with the RCI synthesis module 130a, for analysing the alignment of the spine of the subject from the RCI, and wherein the quantitative alignment analysis module 140a utilises computerised deep learning techniques, and provides a predictive output 142a.
- the predictive output can be the Cobb angle of the subject.
- the alignment analysis module 140a provides the severity of any spinal deformity of the subject, and the curve classification of the deformity of the spine of the subject.
- In Figure 1b, a further embodiment of a system 100b of the present invention is shown, for providing a radiograph-comparable image (RCI) of a region of interest (ROI) spinal region of a subject.
- the system 100b includes:
- a Red Green Blue-Depth (RGBD) standardization module 110b for receiving an RGBD input image 102 of the ROI of a subject and for providing a standardised RGBD image 104a of said ROI;
- a back landmark detection module 120b for detecting anatomic landmarks from the standardised RGBD image 104b of said ROI.
- a landmark guided RCI synthesis module 130b for providing a synthesized RCI 132b of the ROI of the spine of the subject from the detected anatomical landmarks from the back landmark detection module and from the standardised RGBD image 104b;
- the Red Green Blue-Depth (RGBD) standardization module 110b is implemented with rule-based and adaptive algorithms to standardize the images.
- the back landmark detection module 120b and the landmark guided RCI synthesis module 130b utilise computerised deep learning techniques.
- An RGBD input image of a region of interest (ROI) of the spine of a subject is acquired by an RGBD image acquisition device for providing an RGBD input image 102b.
- ROI region of interest
- the system 100b also includes a quantitative alignment analysis module 140b in communication with the RCI synthesis module 130b, for analysing the alignment of the spine of the subject from the RCI, and wherein the quantitative alignment analysis module 140b utilises computerised deep learning techniques, and provides a predictive output 142b.
- the system 100b further includes a radiographic standardisation module 150b which is implemented with rule-based and adaptive algorithms, wherein the radiographic standardisation module 150b receives a radiographic image 152b of an ROI of a spine and provides a standardised X-ray 154b, and wherein the RCI synthesis module 130b further provides the synthesized RCI of the ROI of the spine of the subject from the standardised X-ray 154b.
- Referring to Figure 1c, there is shown an exemplary embodiment of a process 100c according to the present invention.
- the process 100c provides a radiograph-comparable image (RCI) of the spinal region of a subject, using a system comprising a Red Green Blue-Depth (RGBD) standardization module; a back landmark detection module and a landmark guided RCI synthesis module.
- RCI radiograph-comparable image
- the Red Green Blue-Depth (RGBD) standardization module is implemented with rule-based and adaptive algorithms to standardize the images; and wherein the back landmark detection module and the landmark guided RCI synthesis module utilise computerised deep learning techniques, said process including the steps as follows:
- Step (i) 110c - acquiring an RGBD input image of a region of interest of the spine by way of an RGBD image acquisition device;
- Step (iv) 140 - analysing the alignment of the spine of the subject from the RCI, by way of the quantitative alignment analysis module.
- the present inventor has sought to accurately quantify AIS spine malalignment without using any radiation techniques.
- Light-Based Radiograph-Comparable Image (RCI) Synthesis was explored, and a deep learning approach to synthesize RCI from nude back images containing RGB and depth information (RGBD) was developed, as shown in Figure 1d, which is implemented in a similar manner as described above in relation to Figure 1a and Figure 1b.
- the models generate synthesized RCI containing anatomical morphology information of the spine to accurately quantify spinal alignment.
- the reliability of the models of the present invention was prospectively validated with multiple tasks in two clinics, including back landmark auto-detection, RCI synthesis, and scoliosis severity and curve type classification.
- Technology of the present invention has the potential to facilitate radiation-free, fast, and accurate AIS analysis.
- the system 100d includes a pipeline consisting of 1) a RGBD and radiograph standardization module 110d, 2) a back landmark detection module 120d, 3) a landmark guided RCI synthesis module 130d and 4) a quantitative alignment analysis module 140d.
- the first module 110d is implemented with rule-based and adaptive algorithms to standardize the images, while the last three modules, 120d, 130d and 140d, adopt deep learning techniques.
- Figure 1e shows the application and evaluation of the AI system.
- RGBD images captured with the smartphone and equipment of the present invention are transmitted to the cloud data centre and backend AI server that hosts the light-based RCI synthesis and AlignPro system for analysis. Then, the results can be instantly transmitted and displayed back to the smartphone and equipment.
- AlignPro TM is a robust deep learning-based prediction of spinal alignments irrespective of image qualities acquired from smartphone photographs of radiographs displayed on PACS (picture archiving and communication system) , by Conova Medical Technology Limited.
- RCI radiograph-comparable image
- RGB Red Green Blue
- RGBD Red Green Blue-Depth
- AI artificial intelligence
- Adolescent idiopathic scoliosis is the most common type of spinal disorder affecting children. Clinical screening and diagnosis require physical and radiographic examinations, which are either subjective or increase radiation exposure.
- a radiation-free portable system and device was developed and validated, utilising light-based depth sensing and deep learning technologies to analyse AIS by landmark detection and image synthesis.
- RGBD Red Green Blue-Depth
- GT ground truth
- the prediction accuracy of the model on nude back landmark detection was evaluated, as well as the performance on radiograph-comparable image (RCI) synthesis.
- the obtained RCI contains sufficient anatomical information that can quantify disease severities and curve types.
- the model of the present invention had a consistently high accuracy in predicting the nude or naked back anatomical landmarks with a less than 4-pixel error regarding the mean Euclidean and Manhattan distance.
- the synthesized RCI for AIS severity classification achieved a sensitivity and negative predictive value of over 0.909 and 0.933, and the performance for curve type classification was 0.974 and 0.908, with spine specialists’ manual assessment results on real radiographs as GT.
- the radiation-free medical device of the present invention powered by depth sensing and deep learning techniques can provide instantaneous and harmless spine alignment analysis which has the potential for integration into routine screening for adolescents.
- the dataset includes patients from two scoliosis clinics: 1) The Duchess of Kent Children’s Hospital and, 2) Queen Mary Hospital in Hong Kong.
- the sex of each participant was determined in terms of physiological characteristics (according to the information on his/her ID card) .
- a bareback RGBD image and a whole-spine standing posteroanterior radiograph were acquired.
- the RGBD imaging system consists of an RGBD camera 25 , a portable computer and a self-designed mobile stand as shown in Figure 6 and Figure 7.
- the radiograph was acquired using the EOS™ (EOS Imaging, Paris, France) biplanar stereoradiography machine.
- the proposed system consists of four modules as shown in Figure 1a: 1) a RGBD and radiograph standardization module, 2) a back landmark detection module, 3) a landmark guided RCI synthesis module, and 4) a quantitative alignment analysis module.
- Image pre-processing was conducted to standardize the images and resolve the misalignment of RGBD images and radiographs obtained from different imaging systems.
- a novel image registration algorithm was developed for this purpose to align the RGBD and radiographs according to the landmarks of the 7 th cervical vertebra (C7) and the tip of coccyx (TOC) in radiographs and RGBD images.
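As an illustrative sketch only (the disclosure does not detail the registration algorithm), aligning two images from a pair of corresponding landmarks such as C7 and TOC can be modelled as a 2D similarity transform estimated from the two point pairs; the function name below is hypothetical:

```python
import numpy as np

def two_point_similarity(src_c7, src_toc, dst_c7, dst_toc):
    """Estimate a 2D similarity transform (scale + rotation + translation)
    mapping the C7 and TOC landmarks of one image onto the other.

    Points are (x, y) pairs. Returns a function p -> transformed p.
    Illustrative landmark-based registration sketch only.
    """
    # Represent points as complex numbers: a similarity transform is z -> a*z + b.
    s0, s1 = complex(*src_c7), complex(*src_toc)
    d0, d1 = complex(*dst_c7), complex(*dst_toc)
    a = (d0 - d1) / (s0 - s1)   # encodes scale and rotation
    b = d0 - a * s0             # translation
    def apply(p):
        z = a * complex(*p) + b
        return np.array([z.real, z.imag])
    return apply
```

Once estimated from the two shared landmarks, the same transform can be applied to every pixel or landmark of the RGBD image to bring it into the radiograph's frame.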
- Random sample consensus has been used to standardize the image capture angle.
- Each captured image is standardized according to the checkerboard plane which is placed perpendicular to the ground and beside the patient as shown in Figure 8a.
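The random sample consensus step above can be sketched as a RANSAC plane fit to the checkerboard's 3D points recovered from depth; the tolerance and iteration count below are illustrative assumptions, not values from the disclosure:

```python
import numpy as np

def ransac_plane(points, n_iters=200, tol=2.0, rng=None):
    """Fit a plane to 3D points (N x 3) with RANSAC.

    Returns (unit normal, a point on the plane, boolean inlier mask).
    Sketch of recovering the checkerboard plane to standardise the
    capture angle; thresholds are assumptions.
    """
    rng = np.random.default_rng(rng)
    best_inliers = np.zeros(len(points), dtype=bool)
    best = None
    for _ in range(n_iters):
        sample = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(n)
        if norm < 1e-9:          # degenerate (collinear) sample
            continue
        n = n / norm
        dist = np.abs((points - sample[0]) @ n)   # point-to-plane distance
        inliers = dist < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers, best = inliers, (n, sample[0])
    if best is None:
        raise ValueError("no valid plane found")
    return best[0], best[1], best_inliers
```

The recovered plane normal can then be used to rotate the capture frame so the checkerboard is axis-aligned, standardising the viewing angle across sessions.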
- both aligned radiograph and RGBD image were cropped to a 512 × 256 patch, only containing the back region as shown in Figure 8b.
- the severity of spine deformity was classified into 3 categories according to clinical standard 26, 27 . Curves with a CA larger than 40° were classified as severe, between 20° and 40° were considered as moderate, between 0° and 20° were considered as normal-mild.
- the deformity severity assessment standard and clinical management are presented in Table 1.
- the curve type was regarded as thoracic (T) if the curve apex was between the 1 st to the 11 th thoracic vertebrae and considered as thoracolumbar or lumbar (TL/L) if the curve apex was between the 12 th thoracic vertebra and the 5 th lumbar vertebrae.
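The two clinical rules above (severity grading by Cobb angle, curve typing by apex level) reduce to simple threshold logic; boundary handling at exactly 20° and 40° is an assumption here:

```python
def grade_severity(cobb_angle):
    """Map a Cobb angle (degrees) to the three severity levels above.
    Assignment at exactly 20/40 degrees is an assumption."""
    if cobb_angle > 40:
        return "severe"
    if cobb_angle >= 20:
        return "moderate"
    return "normal-mild"

def curve_type(apex_vertebra):
    """Classify the major curve from its apex level, written e.g. 'T8' or 'L3'.
    Apex at T1-T11 -> thoracic (T); apex at T12-L5 -> thoracolumbar/lumbar (TL/L)."""
    region, level = apex_vertebra[0].upper(), int(apex_vertebra[1:])
    if region == "T" and level <= 11:
        return "T"
    return "TL/L"
```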
- the light-based RCI synthesis system consisted of a back landmark detection module and an RCI synthesis module as shown in Figure 9.
- the back landmark detection module adopted the modified HRNet backbone as shown in Figure 10 to predict the 6 anatomical landmarks. It utilised different branches to extract multi-scale features from the images and then integrated these features to achieve better model outputs.
- the number of channels output by the module was 6 and each channel was a heatmap, providing the probability of the position of the landmark.
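Decoding such per-landmark probability heatmaps into pixel coordinates is typically done by locating each channel's peak; the following is a minimal sketch, not the module's actual post-processing:

```python
import numpy as np

def heatmaps_to_landmarks(heatmaps, threshold=0.3):
    """Decode a (6, H, W) stack of per-landmark probability heatmaps into
    (x, y) coordinates.

    A landmark is reported only if its peak probability reaches the
    threshold (cf. the 0.1-0.6 threshold sweep discussed later);
    otherwise None is returned for that landmark. Illustrative only.
    """
    coords = []
    for hm in heatmaps:
        idx = np.unravel_index(np.argmax(hm), hm.shape)  # (row, col) of peak
        if hm[idx] >= threshold:
            coords.append((idx[1], idx[0]))              # return as (x, y)
        else:
            coords.append(None)
    return coords
```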
- the RCI synthesis module adopted the CycleGAN framework but used ResNet as the generator. It took the concatenation of the RGBD images (4-channels) and the 6 detected landmark heatmaps (6-channels) , in total 10-channel data, as input and outputs an RCI image. The details of the ResNet model and PatchGAN model can be found in Figure 11.
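The 10-channel generator input described above can be assembled by simple channel-wise concatenation (a channels-first layout is assumed here, with the 512 × 256 patch size used elsewhere in the document):

```python
import numpy as np

# Sketch of the generator input: a 4-channel RGBD image concatenated with
# 6 landmark heatmaps along the channel axis, giving a 10-channel tensor.
H, W = 512, 256
rgbd = np.zeros((4, H, W), dtype=np.float32)      # R, G, B, depth
heatmaps = np.zeros((6, H, W), dtype=np.float32)  # one heatmap per landmark
gen_input = np.concatenate([rgbd, heatmaps], axis=0)
assert gen_input.shape == (10, H, W)
```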
- the quantitative alignment analysis module utilised the online platform (AlignPro 27 ) for automatic CA prediction. Both original radiographs and generated RCIs were sent to the server of AlignPro and then the endplate landmarks of the end vertebrae located in different spinal curves could be obtained.
- the obtained landmarks were further analysed and adjusted by senior clinicians to eliminate noise in the automatic predictions and fill the missing landmarks without reference to the original radiographs.
- the final output CAs from this module were calculated according to the landmarks reviewed by the clinicians using the AlignPro platform.
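Given the clinician-reviewed endplate landmarks, the Cobb angle is the angle between the two endplate lines; a minimal geometric sketch (AlignPro's internal computation is not disclosed):

```python
import numpy as np

def cobb_angle(upper_endplate, lower_endplate):
    """Cobb angle (degrees) from two endplate line segments, each given as
    ((x1, y1), (x2, y2)) landmark pairs. Illustrative geometry only."""
    def direction(seg):
        (x1, y1), (x2, y2) = seg
        v = np.array([x2 - x1, y2 - y1], dtype=float)
        return v / np.linalg.norm(v)
    u, v = direction(upper_endplate), direction(lower_endplate)
    # Angle between the two endplate lines, independent of point order.
    cosang = abs(float(np.dot(u, v)))
    return float(np.degrees(np.arccos(np.clip(cosang, 0.0, 1.0))))
```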
- the RGBD imaging device mainly consists of an RGBD camera, a portable computer, and a self-designed mobile stand.
- the RGBD camera is an Azure Kinect DK camera which is used to record both appearance and depth information of the nude back.
- a portable computer (with Windows OS, an Intel Core i5 CPU and 16GB memory) is connected to the camera and users can operate the camera for RGBD image capturing and archiving through self-developed software of the present invention. Both the camera and computer are installed on a portable stand for the convenience of users. The appearance of the device for the data collection is shown in Figure 1e.
- MED Mean Euclidean distance
- MMD mean Manhattan distance
- FID Fréchet inception distance 28
- VIF Visual information fidelity
- LPIPS Learned perceptual image patch similarity
- NIQE Natural image quality evaluator
- BRISQUE Blind/referenceless image spatial quality evaluator
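Of the metrics listed, the landmark error measures MED and MMD are straightforward to compute; a sketch assuming predicted and ground-truth landmarks as (N, 2) pixel arrays:

```python
import numpy as np

def med_mmd(pred, gt):
    """Mean Euclidean distance (MED) and mean Manhattan distance (MMD)
    between predicted and ground-truth landmark arrays of shape (N, 2),
    in pixels, matching the error metrics reported in the text."""
    pred, gt = np.asarray(pred, float), np.asarray(gt, float)
    diff = pred - gt
    med = float(np.mean(np.linalg.norm(diff, axis=1)))  # L2 per landmark
    mmd = float(np.mean(np.abs(diff).sum(axis=1)))      # L1 per landmark
    return med, mmd
```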
- Referring to Figure 2a, there is shown statistical analysis and visual results of the landmarks detection on the bareback.
- b Depth and combined heatmaps of the 6 bareback anatomical landmarks for landmark detection.
- 3 samples for each disease severity (normal-mild, moderate and severe) are presented.
- Left side presents the contour plot of the depth image of each patient with predicted and GT landmarks.
- the colourbar demonstrates the height of the surface measured in millimetres relative to the height of landmark C7. The higher the region is, the closer the region to the camera (e.g., red means closer to the camera while blue means away from the camera) .
- Right side displays the RGB image of the patient as background, and the landmark heatmap as foreground.
- the colourbar denotes the probability of the appearance of the landmarks (e.g., red means higher probability and blue means lower probability) .
- the violin plot shows the distribution of the error values.
- the box plot is defined as a graphical representation of the distribution of the data, with the box representing the interquartile range, the line inside the box denoting the median, and the whiskers representing the range of data.
- the violin plot provides an intuitive view of the error distribution.
- the region smaller than zero was cropped due to the nonnegativity of MED and MMD. Meanwhile, the region larger than the 95 th percentile was also cropped to ignore the outliers.
- the scatter plot shows how the error data points are distributed along the y-axis.
- Referring to Figure 4, there is now shown the performance evaluation of the model of the present invention on CA prediction.
- a Linear regression analysis of CAs on the prospective test data. The x-axis denotes the CA degrees predicted from synthesized RCIs of the present invention, while the y-axis denotes the CA degrees obtained from GT radiographs.
- b Bland-Altman plots assessing the agreement of CAs obtained from synthesized and GT radiographs on the prospective test data.
- the x-axis represents the average degree of CAs obtained from synthesized and GT Radiographs, while y-axis denotes the difference between them.
- c confusion matrix for the severity grading (3 types: normal-mild, moderate, and severe) .
- d-e confusion matrices for detection of the presence T and TL/L curves, respectively.
- f confusion matrix for major curve type prediction.
- CA Cobb angle
- RCI radiograph-comparable image
- GT ground truth
- T thoracic
- TL/L thoracolumbar/lumbar
- SD standard deviation.
- the p-value helps to determine the correlation between the predicted CA using real radiographs and RCIs, and the small value (p-value < 0.0001) indicates they have a strong correlation.
- TP, TN, FP, and FN refer to true positive, true negative, false positive, and false negative predictions, respectively. All the statistical analysis has been done using Python (v3.8) and several python packages, including Numpy (v1.18.5) , SciPy (v1.5.2) , Ptitprince (v0.2.6) , pandas (v1.1.3) , seaborn (v0.11.0) , and Matplotlib (v3.3.2) .
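The reported statistics derive from these confusion counts and from paired CA differences; hedged sketches of sensitivity/NPV and of the Bland-Altman bias with 95% limits of agreement follow (function names are illustrative):

```python
import numpy as np

def classification_metrics(tp, tn, fp, fn):
    """Sensitivity and negative predictive value from confusion counts,
    the statistics quoted for severity and curve type classification."""
    sensitivity = tp / (tp + fn)
    npv = tn / (tn + fn)
    return sensitivity, npv

def bland_altman(pred, gt):
    """Bias and 95% limits of agreement for a Bland-Altman analysis of
    Cobb angles from synthesized RCIs vs. ground-truth radiographs."""
    pred, gt = np.asarray(pred, float), np.asarray(gt, float)
    diff = pred - gt
    bias = float(diff.mean())
    sd = float(diff.std(ddof=1))           # sample standard deviation
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)
```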
- the sub-sampling sample size determination method (SSDM) has been used to examine how sample size affects the two models of the present invention.
- SSDM's application to deep learning in medical imaging is still in its early stages, and there is no suitable SSDM for the problem studied in this work.
- a practical SSDM was chosen, i.e., a curve-fitting method 33 , in this paper to empirically assess the model’s effectiveness at various sample size proportions.
- the training data was randomly sub-sampled by a proportion factor of 4%, 8%, 16%, 32%, 64%, 100%, and for each factor, the models were trained 10 times as shown in Figure 13 and Figure 14.
- the weights from the training history with the minimal validation loss were stored and then assessed on the prospective testing dataset, yielding 10 test loss estimates for each proportion factor as shown in Figure 15.
- as the training sample size increased, the model performance improved.
- for the landmark detection model, the sampling proportion should be larger than 32% to ensure the model performance and stability, while for the RCI synthesis model, all training samples should be used to achieve the best model performance.
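The curve-fitting SSDM described above can be sketched as fitting a learning curve of test loss versus training proportion; the power-law form below is an assumption, and the cited curve-fitting method 33 may parameterise it differently:

```python
import numpy as np

def fit_learning_curve(proportions, losses):
    """Fit test loss ~ a * proportion**b on a log-log scale (a simple
    power-law learning curve) so performance at unseen sample sizes can
    be extrapolated. Returns (a, b)."""
    p = np.log(np.asarray(proportions, dtype=float))
    y = np.log(np.asarray(losses, dtype=float))
    slope, intercept = np.polyfit(p, y, 1)   # linear fit in log-log space
    return float(np.exp(intercept)), float(slope)

# The sub-sampling factors used in the text.
props = [0.04, 0.08, 0.16, 0.32, 0.64, 1.0]
```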
- CA Cobb angle
- SD standard deviation
- BMI body mass index
- T thoracic
- TL/L thoracolumbar/lumbar.
- MED mean Euclidean distance
- MMD mean Manhattan distance
- SD standard deviation
- C7 7 th cervical vertebra
- PIIS posterior inferior iliac spine
- TOC tip of coccyx.
- P1 - P6 correspond to the landmarks in Fig. 2.
- a high value means the probability of the landmark being located at the corresponding position is high, and vice versa.
- a probability threshold was used to filter the heatmaps and thereby evaluate their quality. For each threshold value (from 0.1 to 0.6), HRNet achieved the highest landmark retrieval rate for all 6 landmarks, and its retrieval rate remained relatively stable across threshold values compared with the other architectures.
- Inference memory means the required memory during the model testing.
- C7 the 7th cervical vertebra
- PIIS posterior inferior iliac spine
- TOC tip of coccyx
- Referring to FIGS. 3a-3f, there are shown RCI synthesis results from the RCI synthesis module.
- Each subfigure presents a case with a certain severity level and curve type of scoliosis. From left to right, each subfigure presents the RGB image of the patient’s nude back, depth image of the patient’s nude back, the GT radiograph of the spine region and the synthesized RCI.
- Figure 3a and Figure 3b exhibit the RCI synthesis results for normal-mild cases.
- Figure 3c and Figure 3d show the moderate cases. Among them, the patient in Fig. 3c has both T and TL curves while the one in Fig. 3d has only a TL curve.
- Figure 3e and Figure 3f present the severe cases. Among them, the patient in Fig. 3e has a T curve while the patient in Fig. 3f has both T and TL curves.
- after comparison of 12 combinations of generators and discriminators in Table 7 in terms of 7 quantitative image quality metrics, the pixel-wise metrics such as peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) were found to be unstable and inconsistent with the other metrics, and thus were not suitable as measurements for RCI synthesis.
- unlike PSNR and SSIM, the other 5 metrics, namely, 1) FID 28, 2) VIF 29, 3) LPIPS 30, 4) NIQE 31, and 5) BRISQUE 32, performed consistently and indicated that a ResNet 35 generator with 9 blocks paired with a 5-convolutional-layer PatchGAN 36 discriminator achieved the best performance.
- "UNet_128" means the UNet architecture (for input 128x128) with 7 different scales of features, while "UNet_256" means the UNet architecture (for input 256x256) with 8 different scales of features.
- "5 layers" means the PatchGAN architecture with 5 convolutional layers
- "8 layers" means the PatchGAN architecture with 8 convolutional layers
- "PixelGAN" refers to a 5-layer CNN with the PixelGAN architecture.
- Referring to FIG. 5, there is shown a visual comparison of practical CA measuring in clinics using the original radiographs (left) and the synthesized RCI counterparts (right).
- the first row presents normal-mild cases with single or double spine curves.
- the second row presents moderate cases with single or double spine curves.
- the third row presents severe cases with double or triple spine curves.
- the patient number is printed on the top left corner on the GT radiograph to indicate different patients.
- the curve type and CA measuring results are printed on the bottom of each radiograph.
- Abbreviations: CA: Cobb angle; RCI: radiograph-comparable image; GT: ground truth.
- NPV negative predictive value
- T thoracic
- TL/L thoracolumbar/lumbar
- RGBD images were collected using the Microsoft Azure Kinect DK camera 38. This camera records both the RGB image and the depth image of the scene simultaneously in a single capture.
- the radiographs were obtained using the EOS TM imaging machine.
- the machine generated two paired radiographs, i.e., an anteroposterior radiograph and a lateral radiograph. Only the anteroposterior radiograph was included in the dataset as the focus was on the coronal Cobb angle measurement.
- the dataset is being collected and enlarged continuously; at the time of manuscript submission, it contained 2,238 patients. These data were collected between October 2019 and May 2022 in two centers, i.e., The Duchess of Kent Children’s Hospital and Queen Mary Hospital in Hong Kong. All the collected RGBD images and radiographs were assessed manually by specialists and experts according to the standard presented in Table 1.
- the RGBD imaging system mainly consists of a consumer-grade RGBD camera, i.e., Microsoft Azure Kinect DK camera 38 , a portable computer and a self-designed mobile stand.
- the whole imaging system was deployed in the spine clinic of the study as shown in Figure 6 and Figure 7.
- the height of the camera was set to 1 meter, and patients were instructed to expose their back at a distance of 1.2 meters from the camera, standing on the patient anchor.
- the self-developed RGBD standardization algorithm embedded in equipment of the present invention can automatically process the original RGBD images from the camera and output the standardized RGBD images.
- the radiograph was acquired using the EOS TM (EOS Imaging, Paris, France) biplanar stereoradiography machine. Patients were required to remove metallic accessories before being led into the scanning cabin of the machine. Both anteroposterior (AP) and lateral radiographs were scanned simultaneously at a scanning speed of 4.6 cm/s. The radiation dose for the AP radiograph is 143.29 mGy·cm² and the scanning region is approximately 22.4 cm, which contains the patient body.
- the AP radiographs were collected using the software of the machine and anonymized before being sent to the radiograph standardization module for image pre-processing to obtain standardized radiographs.
- FIG. 6 presents the sketch map of the experimental settings for RGBD image collection and Figure 7 provides the corresponding real experimental settings.
- the Azure Kinect DK camera was installed on a self-developed mobile stand of the present invention and the height of the camera lens was about 0.95 meters (0.95 ± 0.03 m).
- the patient was required to stand right ahead of the camera.
- an anchor (patient anchor) was placed on the ground to denote the standing point of the patient, and each patient was required to stand on this standing point when capturing the RGBD image.
- the distance between the anchor and camera lens was about 1.2 meters (1.2 ± 0.05 m) to ensure that both the RGB camera scope (red rectangle in Figure 6) and the depth camera scope (green hexagon in Figure 6) contained the patient body.
- the Azure Kinect DK camera was connected to a computer with the Azure Kinect SDK software 39 installed to collect and standardize the images. During image collection, all research assistants were instructed to capture a short video (1-2 seconds) of the patient's nude back. A pair of RGB and depth images was extracted from the same frame of each short video.
- the radiographs were captured using the EOS TM biplanar stereoradiography machine (EOS Imaging, Paris, France).
- a pair of anteroposterior and lateral radiographs were captured each time and only the anteroposterior radiograph was collected for this dataset.
- the images were deidentified before being sent to the researchers.
- anatomical landmarks were selected that have been proven to be effective for the diagnosis of adolescent idiopathic scoliosis (AIS) in clinical applications 40, 41. These landmarks included: 1) the 7 th cervical vertebra (C7), 2) left inferior scapular angle, 3) right inferior scapular angle, 4) left posterior inferior iliac spine (PIIS), 5) right PIIS, and 6) the tip of coccyx (TOC). All 6 landmarks in each RGB bareback image were manually labelled by senior surgeons with over 20 years’ clinical experience. These landmarks were used as the ground truth (GT) to train the AI model of the present invention used in the landmark detection module.
- GT ground truth
- two landmarks on each X-ray image were annotated, i.e., C7 and tip of coccyx.
- the selection of these two landmarks has undergone careful consideration.
- the results of landmark detection as shown in Figure 2a also demonstrate this. That is, the detection of C7 and the tip of coccyx achieves minimal error and variance in terms of mean Euclidean distance (MED) and mean Manhattan distance (MMD) .
- MED mean Euclidean distance
- MMD mean Manhattan distance
- the two landmarks in each X-ray image were also manually labelled by senior surgeons with over 20 years’ clinical experience.
- the cross-modality registration algorithm was designed to utilize these two landmarks to match each pair of RGB bareback image and X-ray image.
- the image standardization module consists of an RGBD standardization module and a radiograph standardization module.
- the RGBD standardization module matches each pair of RGB and depth images, normalizes the depth image and crops the back region.
- the raw images were pre-processed using the inherent function (k4a_transformation_depth_image_to_color_camera) of Azure Kinect SDK to map the depth camera scope to RGB camera scope.
- the depth image was further processed to remove the background and keep the human foreground.
- the histogram of the depth image counts the depth values over all pixel positions. According to the experimental results, there are two summits in the histogram plot, i.e., the first one corresponds to the foreground and the second one to the background. The depth values belonging to the second summit are removed from the depth image, and the remaining depth values are normalised to the range [0, 1] for the convenience of model training. Finally, the back region in both the RGB and depth images was cropped according to the 6 landmarks.
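The histogram-based background removal described above can be sketched as follows. The synthetic depth map, the camera distances, and the "emptiest bin between the two summits" split are illustrative assumptions, not the exact algorithm of the invention.

```python
import numpy as np

rng = np.random.default_rng(0)
person = rng.normal(1.2, 0.02, size=(80, 40))   # foreground, ~1.2 m from the camera
wall   = rng.normal(3.0, 0.05, size=(80, 40))   # background wall, ~3.0 m
depth  = np.hstack([person, wall])

# Histogram of depth values: two summits, foreground first, background second.
hist, edges = np.histogram(depth, bins=16)
centers = (edges[:-1] + edges[1:]) / 2
top_two = np.sort(np.argsort(hist)[-2:])        # bin indices of the two summits
# Split at the emptiest bin between the two summits.
valley = top_two[0] + np.argmin(hist[top_two[0]:top_two[1] + 1])
threshold = centers[valley]

foreground = np.where(depth < threshold, depth, 0.0)   # drop background pixels
mask = foreground > 0
# Normalize the remaining depth values to [0, 1] for model training.
fg = foreground[mask]
foreground[mask] = (fg - fg.min()) / (fg.max() - fg.min())
```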
- the detailed RGBD standardization pipeline is presented in Figure 8.
- the X-ray standardization module aligns the X-ray image with the RGB image according to the annotations. Since C7 and TOC have been annotated in both RGB images and X-ray images, they can be used as reference key points to align the RGB and X-ray images, so that the back region can be largely overlapped in both images.
- the algorithm pipeline is presented in Figure 8a and Figure 8b.
- the back landmark detection module adopts HRNet 42 backbone for the anatomical landmark detection.
- the detailed structure of HRNet for end vertebra detection and landmark detection is presented in Figure 10.
- the convolutional layers were used for feature extraction and presentation learning.
- the hyperparameters of each convolutional layer are presented in parentheses with the format "c#k#s#p#", where # denotes a number.
- for example, the layer Conv2d (c64k3s2p1) means a convolutional layer with 64 filters, a 3×3 kernel, stride 2, and zero-padding 1. Different colors are used to denote different types of layers or composite layers. For composite layers, the detailed contents are listed beside the network structure in the figure.
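The layer-spec shorthand can be parsed mechanically; `parse_conv_spec` below is a hypothetical helper written for illustration, not part of the invention.

```python
import re

def parse_conv_spec(spec):
    """Parse a layer spec like 'c64k3s2p1' into its hyperparameters:
    c = number of filters, k = kernel size, s = stride, p = zero-padding."""
    m = re.fullmatch(r"c(\d+)k(\d+)s(\d+)p(\d+)", spec)
    if m is None:
        raise ValueError(f"bad conv spec: {spec}")
    c, k, s, p = map(int, m.groups())
    return {"filters": c, "kernel": k, "stride": s, "padding": p}

print(parse_conv_spec("c64k3s2p1"))
# {'filters': 64, 'kernel': 3, 'stride': 2, 'padding': 1}
```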
- the batch-normalization layer (denoted as BatchNorm2d) normalizes the input features through the equation y = γ · (x − μ) / √(σ² + ε) + β, where μ and σ² are the mean and variance of the input features, ε is a small constant for numerical stability, and γ and β are learnable parameters.
- the activation layer adopted in the network is the widely used rectified linear unit (ReLU) layer, which activates the outcome of the previous layer through the equation ReLU (x) = max (0, x).
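For illustration, the two layer types can be sketched in NumPy with toy inputs. Note the sketch computes statistics over the whole array, whereas BatchNorm2d normalizes per channel over the batch.

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """BatchNorm-style normalization: y = gamma * (x - mean) / sqrt(var + eps) + beta."""
    mu, var = x.mean(), x.var()
    return gamma * (x - mu) / np.sqrt(var + eps) + beta

def relu(x):
    """ReLU(x) = max(0, x): passes positive values, zeroes out negatives."""
    return np.maximum(0.0, x)

x = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
print(relu(x))             # [0. 0. 0. 1. 3.]
y = batch_norm(x)
print(round(y.mean(), 6))  # ~0: normalized features are zero-centered
```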
- the direct outputs of the 4 deep learning models are landmark heatmaps.
- the final predicted position of each landmark is the position of the highest value in the corresponding landmark heatmap.
- the values in a landmark heatmap are between 0 and 1 and can be considered as the probability of the landmark position. A lower value in the landmark heatmap therefore means a lower probability of the landmark being located at that position, and vice versa. Hence, for landmark detection, though the precision of the landmark location is important, the probability distribution of each landmark is equally crucial.
- a threshold to filter the heatmap results is used, and then the position of highest value on the filtered heatmap is detected. When increasing the threshold value, some landmarks’ position with low probability (smaller than the threshold) can be filtered out, and thus these landmarks cannot be retrieved.
- the retrieval rate is counted in terms of the probability threshold for each of the 6 landmarks and the results are presented in Table 6. As shown, compared with other deep learning architectures, HRNet performs consistently better on detection of all the 6 landmarks. Even when the threshold is set to 0.6, HRNet still achieves a good landmark retrieval rate.
- the predicted location of the landmark with a low probability is not accurate and may have negative impact on the disease analysis.
- the simplest method is to use a threshold to filter out the bad results with low probability, but if the value of threshold is too high, some good predictions can be filtered out which will deteriorate the model performance. As a result, there is a trade-off between the probability thresholds and model performance.
- a plot of the curves between the retrieval rate/MMD/MED and the probability threshold is given in Figure 12. According to the figure, 0.6 was selected as the probability threshold to filter out the bad predictions. With this threshold value, the HRNet used in the present invention can still retrieve over 90% of the landmarks while the predicted landmarks with low probability are removed.
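The peak-plus-threshold rule described above can be sketched as follows; the Gaussian toy heatmap and the `detect_landmark` helper are illustrative assumptions.

```python
import numpy as np

def detect_landmark(heatmap, threshold=0.6):
    """Return the (row, col) of the heatmap peak, or None if the peak
    probability is below the threshold (landmark not retrieved)."""
    peak = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    if heatmap[peak] < threshold:
        return None
    return peak

# Toy 2D Gaussian heatmap peaking at (12, 20) with probability 0.9.
h, w = 32, 32
ys, xs = np.mgrid[0:h, 0:w]
heatmap = 0.9 * np.exp(-((ys - 12) ** 2 + (xs - 20) ** 2) / (2 * 3.0 ** 2))

print(detect_landmark(heatmap))        # (12, 20): confident prediction is kept
print(detect_landmark(0.4 * heatmap))  # None: low-probability prediction is filtered out
```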
- the RCI synthesis module consists of a generator and a discriminator.
- the generator adopts a ResNet backbone with 9 residual blocks while the discriminator uses the PatchGAN framework, and the model consists of 5 convolutional layers.
- the performance of the generator is not only related to its own structure, but also closely related to the discriminator. Therefore, ablation studies are conducted to evaluate different combinations and architectures of generator and discriminator to obtain the optimal generation results.
- Table 7 evaluates the RCI synthesis results using several quantitative metrics. There are 7 commonly used quantitative measurements adopted in the table to indicate the synthesized results of generator.
- PSNR peak signal-to-noise ratio
- FID Frechet inception distance
- VIF visual information fidelity
- LPIPS (learned perceptual image patch similarity) 50 measures the l 2 distance between the AlexNet activations of the generated images and the ground truth images.
- NIQE (natural image quality evaluator) 51 is a no-reference image quality metric which utilizes measurable deviations from statistical regularities observed in natural images. Mathematically, it calculates the Mahalanobis distance between two multi-variate Gaussian models of 36 features extracted from the generated images.
- BRISQUE (blind/referenceless image spatial quality evaluator) 52 is another no-reference image quality metric which extracts features from a Gaussian model of mean subtracted contrast normalized coefficients obtained from the generated images. These features are then fed into an SVM to acquire the quality score.
- the selected 7 metrics evaluate the model performance in multiple aspects. From Table 7, the pixel-wise quality metrics such as PSNR and SSIM are inappropriate for evaluating the quality of the synthesized RCI, since pixel-level accurate reconstruction of the ground truth X-rays is not pursued. Also, from the results, it can be seen that the best results in terms of PSNR and SSIM are totally different, which also reflects that these two metrics are not stable for evaluating the results in this task. On the contrary, the other distribution-based and feature-based image quality metrics, i.e., VIF, FID, LPIPS, NIQE, and BRISQUE, perform much more consistently.
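The instability of pixel-wise metrics under misalignment can be demonstrated with a toy example: shifting an otherwise identical image by a few pixels collapses PSNR even though the content is unchanged. The stripe image and shift size below are arbitrary assumptions.

```python
import numpy as np

def psnr(a, b, peak=1.0):
    """Peak signal-to-noise ratio: 10 * log10(peak^2 / MSE)."""
    mse = np.mean((a - b) ** 2)
    return 10 * np.log10(peak ** 2 / mse)

# A synthetic "spine" stripe image and the same image shifted by 3 pixels:
# structurally identical content, but pixel-wise PSNR drops sharply.
img = np.zeros((64, 64))
img[:, 28:36] = 1.0                    # a bright vertical band
shifted = np.roll(img, 3, axis=1)      # same band, misaligned by 3 px

print(psnr(img, img + 0.01))           # high PSNR: tiny intensity noise only
print(psnr(img, shifted))              # low PSNR: misalignment dominates the score
```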
- an ablation study examines 4 generators and 3 discriminators, that is, in total 12 different combinations for the GAN framework.
- for the generator, 2 architectures were studied, namely ResNet and UNet, with different hierarchical structures (6 and 9 blocks for the ResNet architecture; 7 and 8 scale levels for the UNet architecture).
- for the discriminator, 2 architectures were also studied, namely PatchGAN and PixelGAN.
- the PatchGAN is a model which outputs a matrix of bool values, and each value corresponds to a small patch in the original image.
- PixelGAN is a special case of PatchGAN whose output bool matrix is with the same spatial size as the input image. Therefore, each value in the output matrix corresponds to a single pixel in the input image.
- the discriminator adopting PatchGAN outperforms the one adopting PixelGAN
- the generator adopting ResNet outperforms the one adopting UNet. From Table 7, it can be seen that a generator with a ResNet-9blocks architecture combined with a discriminator with a 5-convolutional-layer PatchGAN architecture achieves the best performance.
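The patch-grid behaviour of a PatchGAN discriminator can be illustrated by computing its output map size. The kernel-4, stride-2/stride-1 layout below is the common PatchGAN configuration and is an assumption, not necessarily the exact architecture used here.

```python
def conv_out(size, kernel=4, stride=2, padding=1):
    """Output spatial size of a convolution (floor-division formula)."""
    return (size + 2 * padding - kernel) // stride + 1

def patchgan_output(size, n_layers=5):
    """Spatial size of a PatchGAN discriminator's output map, assuming the
    first n_layers-2 convolutions use stride 2 and the last two use stride 1."""
    for _ in range(n_layers - 2):
        size = conv_out(size, stride=2)
    for _ in range(2):
        size = conv_out(size, stride=1)
    return size

# Each value in the output grid judges one overlapping patch of the input image.
print(patchgan_output(256))   # 30: a 256x256 input yields a 30x30 patch grid
```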
- MIS Medical image synthesis
- an AI-powered system has the potential to synthesize RCI from optical images (RGBD images) .
- This finding can potentially provide a radiation-free screening and analysis technique to assist AIS detection and diagnosis.
- the experimental results show that the analytic results obtained from synthesized RCIs are coherent with the results from real radiographs (GT) .
- the developed light-based RCI synthesis system has the capacity to generate reliable RCI to substantively assist the clinical diagnosis procedure for AIS.
- the bareback landmark detection module was designed to output the heatmaps of the 6 bareback landmarks, indicating both the landmark position and probability of its presence.
- the validity and generalization of the bareback landmark detection module was examined on the prospective test dataset quantitatively and statistically, and it achieved less than 4 pixels error on average for each back landmark in terms of both MED and MMD measurements (Table 2) .
- the model can accurately predict the landmarks from the nude back images of patients with different severity levels of AIS as shown in Figure 3b.
- Supplementary Table 3 compares the performance of different models for landmark detection. As shown, HRNet outperforms the other 3 deep learning models by a large margin, especially for the detection of left and right inferior scapular angle landmarks. Combining Supplementary Table 2 and Supplementary Table 3, HRNet uses a modest size of inference memory and relatively smaller computations (FLOPs) to achieve the best performance.
- a threshold was used to filter the heatmap results.
- some landmark positions with probability smaller than the threshold are filtered out and thus cannot be retrieved.
- Supplementary Table 4 illustrates the retrieval rate under different probability threshold for each of the 6 landmarks.
- HRNet performs consistently better on detection of all 6 landmarks under different probability thresholds.
- a landmark located at the position with a low probability value is usually not accurate.
- the probability threshold can help to filter out such inaccurate predictions.
- if the threshold is too high, some good predictions can be filtered out, which will deteriorate the model performance.
- the curves between the retrieval rate/MMD/MED and the probability threshold were plotted. As shown in Figure 12, 0.6 was selected as the probability threshold to filter out bad predictions. With this threshold, the HRNet can still retrieve over 90% of the landmarks while the predicted landmarks with low probabilities are removed.
- the quality of synthesized RCIs was assessed in two aspects, namely the image quality and the usability of the synthesized images in analytic clinical applications. Given the misalignment between RGBD images and radiographs, it was not appropriate to use pixel-wise image quality metrics (e.g., PSNR and SSIM) to evaluate the model performance. In this study, 5 metrics were introduced which measure the quality of the synthesized image in terms of the distribution or properties of extracted image features (FID, LPIPS and NIQE), image fidelity (VIF), and image spatial quality (BRISQUE). Supplementary Table 5 compares the performance of different metrics on the synthesized RCIs. As shown, both PSNR and SSIM had low values, and the results were not consistent. In comparison, the measuring results obtained in terms of FID, LPIPS, NIQE, VIF, and BRISQUE were consistent, especially for the best two combinations of generator and discriminator.
- the clinical quality and usability of synthesized RCIs were assessed in multiple analytic clinical applications, including the severity grading, curve type identification, and CA prediction.
- moderate patients make up about 60% of the participants.
- the second largest group is normal-mild (about 30%)
- the severe group is the minority (about 10%).
- clinicians can distinguish the moderate and normal-mild patients with high sensitivity (>95%), while distinguishing the severe patients with relatively lower sensitivity (0.909), as shown in Table 3. Even so, the severity grading results are still satisfactory in terms of the high accuracy, and the performance on curve type classification also validates the good quality of the synthesized RCIs.
- the performance of the CA prediction largely depends on the quality of the synthesized RCIs, which in turn relies on the accuracy of the detected anatomical landmarks. Considering that the nude back features of obese patients may not be obvious enough to obtain accurate landmarks, no overweight patients were recruited in this study.
- Another potential limitation is skin colour. Since most of the participants are Southeast Asian Chinese, the images in the dataset used for training model may not be able to accurately represent the skin colour of other population groups.
- the investigators deployed the first prospectively tested auto-alignment analysis model for spine malalignment analysis using deep learning and RGBD technologies with no radiation.
- our platform can better assist clinicians and clinical research in large volumes.
Abstract
Disclosed is a computerized system (100a) for providing a radiograph-comparable image (RCI) of a region of interest (ROI) of the spinal region of a subject. The system (100a) comprises a Red Green Blue-Depth (RGBD) standardization module (110a), for receiving an RGBD input image (102a) of the ROI of a subject and for providing a standardised RGBD image (104a) of the ROI; a back landmark detection module (120a), for detecting anatomic landmarks from the standardised RGBD image (104a) of the ROI; and a landmark guided RCI synthesis module (130a), for providing a synthesized RCI (132a) of the ROI of the spine of the subject from the detected anatomical landmarks from the back landmark detection module (120a) and from the standardised RGBD image (104a). The Red Green Blue-Depth (RGBD) standardization module (110a) is implemented with rule-based and adaptive algorithms to standardize the images. The back landmark detection module (120a) and the landmark guided RCI synthesis module (130a) utilise computerised deep learning techniques. The RGBD input image (102a) of a region of interest of the spine of a subject is acquired by an RGBD image acquisition device.
Description
The present invention relates to a device, process and system for diagnosing and determining the spinal alignment of a person. More particularly, the present invention provides a device, process and system for diagnosing and determining the spinal alignment of a subject without the use of radiation.
Spinal health is a crucial factor in the overall health of a person. The spine provides key structural support to the body, and as such any disorders or abnormalities therein may lead to serious adverse effects on both the physical appearance and the cardiovascular, pulmonary and neural systems of a person.
From a clinical background perspective, spine malalignment includes two patient cohorts:
(i) patients with scoliosis, and
(ii) patients with degenerative spinal disorders and deformity, possibly with neck or back pain.
Scoliosis is a three-dimensional (3D) deformity of the spine defined as a Cobb angle 1 (CA: measured by the angle formed by the upper endplate of the uppermost tilted vertebra and the lower endplate of the lowest tilted vertebra of the structural curve) greater than 10 degrees on standing plain radiographs.
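The Cobb angle definition above reduces to the angle between the two endplate lines in the coronal radiograph; a minimal sketch follows, with hypothetical endplate slopes.

```python
import math

def cobb_angle(slope_upper, slope_lower):
    """Cobb angle (degrees) between the upper endplate of the uppermost tilted
    vertebra and the lower endplate of the lowest tilted vertebra, given the
    slopes of the two endplate lines on the coronal radiograph."""
    return abs(math.degrees(math.atan(slope_upper) - math.atan(slope_lower)))

# Hypothetical endplate slopes read off a standing radiograph: the endplates
# tilt in opposite directions, producing a curve just under 20 degrees.
angle = cobb_angle(0.20, -0.15)
print(f"CA = {angle:.1f} degrees; scoliosis if CA > 10")
```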
Among all types of scoliosis (i.e., idiopathic, congenital, neuromuscular, and syndromic), adolescent idiopathic scoliosis (AIS) is the most common in the paediatric population, with a prevalence of up to 2.2% of boys and 4.8% of girls 2, 3 in Hong Kong.
Untreated cases may progress rapidly during the pubertal growth spurt, causing body disfigurement, cardiopulmonary compromise, and back pain4, 5.
In addition, the degeneration of the spine may damage the surrounding muscles, ligaments and joint structures which can worsen the pain and cause additional physical limitations. Early detection and interventions to prevent curve progression is therefore critical6.
Screening using forward bending assessment and scoliometer measurement are the main options to identify individuals with AIS 7, 8. Radiographic examination is the reference standard for quantifying AIS severities and curve types 9. AIS follow-ups and progression monitoring require repetitive radiographic examinations 10. Children are especially sensitive to radiation due to the higher metabolic activity of their cells 11. Thus, accurate and radiation-free approaches 12 for spine alignment analysis are desirable.
Non-radiation techniques for scoliosis assessment have been studied for years. Previous methods include 3-dimensional ultrasound 13, digital inclinometer 14, rasterstereography 15 and electrogoniometer 16.
Deep learning has made considerable progress in image generation and transformation17, 18.
Clinically, such techniques have been used for medical image synthesis (MIS) to facilitate the clinical workflow19, 20, for example, treatment planning21 and PET attenuation correction22. Despite the diverse applications, most of the studies focused on image synthesis between two medical modalities23, 24, and seldom explored the feasibility of image synthesis between medical and optical imaging systems.
Spine malalignment was identified as the leading global cause of disability in most countries in 2015, and a large percentage of it can be attributed to deformity.
As the impact of this disease is recognised globally and the number of patients with back pain and deformity grows, there is increased interest in improving our knowledge of its pathogenesis, of optimal corrective surgeries and of its impact on health-related quality of life.
Hence in the past decade, much effort has been made to determine ideal alignment parameters, fusion level selection, corrective techniques and instrumentation strategies.
Object of the Invention
It is an object of the present invention to provide a device, process and system for diagnosing and determining the spinal alignment of a subject without the use of radiation, which overcomes or at least partly ameliorates some deficiencies associated with the prior art.
Spine malalignment is known to be linked with subsequent conditions such as scoliosis and back pain. Thus, measuring and monitoring spine malalignment is critical for the effective management of the spine health of a subject, particularly children and adolescents.
However, conventional methods of the existing art typically involve the use of radiographic examination and radio-exposure to a subject. As is known, radiographic exposure to all subjects should be minimised, particularly for children, immune compromised persons, and pregnant subjects.
Thus, in view of the problems with the prior art as identified by the present inventor, a novel system, process and device that can identify alignment parameters and measurements of the spine of a subject with no radio-exposure has been proposed, which would allow for more frequent examination and also easier pre- and post-treatment outcome comparisons.
Furthermore, such a novel system, process and device as proposed by the present inventor would also provide for ease of screening of the spines or spinal regions of subjects.
In a broad general aspect, the present invention is directed to providing a radiograph-comparable image (RCI) of a region of interest (ROI) of the spine of a patient or subject.
The RCI according to the present invention is produced by a light-based RCI synthesis system from a Red Green Blue-Depth (RGBD) image of the naked back of a patient.
Advantageously, this obviates the necessity to expose a patient to radiation. This is particularly important for people requiring repetitive imaging, such as in scoliosis, whereby children may need to be imaged, or at least assessed, regularly so as to determine the progression of the condition as well as the effectiveness of the treatment.
It is known that children are especially sensitive to radiation, due in part to the higher metabolic activity of their cells, and the device, system and process of the present invention significantly reduce the exposure of children to radiation.
In a first aspect, the present invention provides a computerized system for providing a radiograph-comparable image (RCI) of a region of interest (ROI) of the spinal region of a subject, said system comprising:
a Red Green Blue-Depth (RGBD) standardization module, for receiving an RGBD input image of the ROI of a subject and for providing a standardised RGBD image of said ROI;
a back landmark detection module, for detecting anatomic landmarks from the standardised RGBD image of said ROI; and
a landmark guided RCI synthesis module, for providing a synthesized RCI of the ROI of the spine of the subject from the detected anatomical landmarks from the back landmark detection module and from the standardised RGBD image;
wherein the Red Green Blue-Depth (RGBD) standardization module is implemented with rule-based and adaptive algorithms to standardize the images, and
wherein the back landmark detection module and the landmark guided RCI synthesis module utilise computerised deep learning techniques; and
wherein a RGBD input image of a region of interest of the spine of a subject is acquired by an RGBD image acquisition device for providing an RGBD input image.
The system may further include a quantitative alignment analysis module, for analysing the alignment of the spine of the subject from the RCI, wherein the quantitative alignment analysis module utilises computerised deep learning techniques. The alignment analysis module may provide a predictive analysis of the Cobb angle of the subject. The alignment analysis module may provide the severity of any spinal deformity of the subject, and the curve classification of the deformity of the spine of the subject.
The system may further include a radiographic standardisation module which is implemented with rule-based and adaptive algorithms, wherein the radiographic standardisation module receives a radiographic image of an ROI of a spine and provides a standardised X-ray, and wherein the RCI synthesis module further provides the synthesized RCI of the ROI of the spine of the subject from the standardised X-ray. The X-ray image may be an X-ray image of the ROI of the subject.
The input image is a two-dimensional (2D) image or a three-dimensional (3D) image.
The system utilises generative Artificial Intelligence (AI) to generate spine alignment using the RGBD input image. The spine alignment may be three-dimensional spine alignment.
In a second aspect, the present invention provides a process operable on a computerized system for providing a radiograph-comparable image (RCI) of the spinal region of a subject, said system comprising a Red Green Blue-Depth (RGBD) standardization module, a back landmark detection module and a landmark guided RCI synthesis module; wherein the Red Green Blue-Depth (RGBD) standardization module is implemented with rule-based and adaptive algorithms to standardize the images; and wherein the back landmark detection module and the landmark guided RCI synthesis module utilise computerised deep learning techniques, said process including the steps of:
(i) acquiring an RGBD input image of a region of interest of the spine by way of an RGBD image acquisition device;
(ii) receiving the RGBD input image by the back landmark detection module;
(iii) synthesizing an RCI by way of landmark guided RCI synthesis module, and
(iv) analysing the alignment of the spine of the subject from the RCI, by way of the quantitative alignment analysis module.
The system may further include a quantitative alignment analysis module, for analysing the alignment of the spine of the subject from the RCI, wherein the quantitative alignment analysis module utilises computerised deep learning techniques. The alignment analysis module may provide a predictive analysis of the Cobb angle of the subject. The alignment analysis module may provide the severity of any spinal deformity of the subject, and the curve classification of the deformity of the spine of the subject.
The system may further include a radiographic standardisation module which is implemented with rule-based and adaptive algorithms, wherein the radiographic standardisation module receives a radiographic image of an ROI of a spine and provides a standardised X-ray, and wherein the RCI synthesis module further provides the synthesized RCI of the ROI of the spine of the subject from the standardised X-ray. The X-ray image may be an X-ray image of the ROI of the subject.
The input image is a two-dimensional (2D) image or a three-dimensional (3D) image.
The system utilises generative Artificial Intelligence (AI) to generate spine alignment using the RGBD input image. The spine alignment may be three-dimensional spine alignment.
In order that a more precise understanding of the above-recited invention can be obtained, a more particular description of the invention briefly described above will be rendered by reference to specific embodiments thereof that are illustrated in the appended drawings. The drawings presented herein may not be drawn to scale and any reference to dimensions in the drawings or the following description is specific to the embodiments disclosed.
Figure 1a shows a schematic representation of a first embodiment of a system according to the present
invention;
Figure 1b shows a schematic representation of a second embodiment of a system according to the present invention;
Figure 1c shows a schematic representation of a first embodiment of a process according to the present invention;
Figure 1d shows a schematic representation of a deep learning approach to synthesize RCI from nude back images containing RGB and depth information (RGBD), in accordance with the present invention;
Figure 1e shows a schematic representation of an application and evaluation of the AI system according to the present invention;
Figure 1f depicts the proportion of different severity levels and genders in all participants, in evaluation of the present invention;
Figure 1g shows an example of aligned RGB, depth and corresponding radiograph, according to the present invention;
Figure 2a shows statistical analysis and visual results of the landmark detection on the bare back, for evaluation of the present invention;
Figure 2b shows the statistics and error distribution for each landmark using the violin plot, box plot, and scatter plot in Figure 2a;
Figure 3a and Figure 3b show normal-mild severity of curvature of the back of a subject;
Figure 3c and Figure 3d show moderate severity of curvature of the back of a subject;
Figure 3e and Figure 3f show severe severity of curvature of the back of a subject;
Figure 4a shows a regression plot to quantitatively assess the validity of the CA parameters estimated from the synthesized RCI, according to the present invention;
Figure 4b shows a Bland-Altman plot for performance on Cobb angle prediction for the present invention;
Figures 4c - 4f show results of classification tasks, namely severity grading (3 types: normal-mild, moderate, and severe), curve detection, and curve type prediction (2 types: T and TL/L);
Figure 5 shows visual comparison of practical CA measuring in clinics using the original radiographs (left) and the synthesized RCI counterparts (right) , according to the present invention;
Figure 6 shows a sketch map of experimental settings for RGBD image collection, according to the present invention;
Figure 7 shows the real experimental setting for RGBD image collection, according to the present invention;
Figure 8a shows a representation of image acquisition for image standardization according to the present invention;
Figure 8b shows RGBD image standardization on the left, and X-ray standardization on the right;
Figure 8c shows depth image processing;
Figure 9 shows the overview of the light-based RCI synthesis system, according to the present invention;
Figure 10 shows the detailed architecture structure of HRNet backbone in the back landmark detection module, according to the present invention;
Figure 11 shows architecture structure of ResNet_9blocks and 5-layer PatchGAN models in the RCI synthesis module, according to the present invention;
Figures 12a - 12f show the impacts of the selection of probability threshold on retrieval rate and distance metrics (MED and MMD) for 6 anatomical landmarks on back images, according to the present invention;
Figures 13a - 13f show landmark detection learning curves for each proportion of training data, according to the present invention;
Figures 14a - 14f show RCI synthesis learning curves for each proportion of training data, according to the present invention; and
Figures 15a and 15b show the loss value on the prospective testing dataset in terms of the subset proportion.
1. Invention General Background and Overview
Referring to Figures 1a - 1g, there is provided an illustration of the pipeline of the light-based RCI synthesis system of the present invention, together with the demographics of the data.
Referring to Figure 1a, Figure 1b and Figure 1c, there is shown a schematic representation of an exemplary embodiment of a system and a process according to the present invention.
As is shown in Figure 1a, an embodiment of a system 100a of the present invention is shown, for providing a radiograph-comparable image (RCI) of a spinal region of interest (ROI) of a subject. The system 100a includes:
(i) a Red Green Blue-Depth (RGBD) standardization module 110a, for receiving an RGBD input image 102a of the ROI of a subject and for providing a standardised RGBD image 104a of said ROI;
(ii) a back landmark detection module 120a, for detecting anatomic landmarks from the standardised RGBD image 104b of said ROI; and
(iii) a landmark guided RCI synthesis module 130a, for providing a synthesized RCI 132a of the ROI of the spine of the subject from the detected anatomical landmarks from the back landmark detection module and from the standardised RGBD image 104a;
The Red Green Blue-Depth (RGBD) standardization module 110a is implemented with rule-based and adaptive algorithms to standardize the images.
The back landmark detection module 120a and the landmark guided RCI synthesis module 130a utilise computerised deep learning techniques.
An RGBD input image of a region of interest (ROI) of the spine of a subject is acquired by an RGBD image acquisition device for providing an RGBD input image 102a.
In the present embodiment, the system 100a also includes a quantitative alignment analysis module 140a in communication with the RCI synthesis module 130a, for analysing the alignment of the spine of the subject from the RCI, wherein the quantitative alignment analysis module 140a utilises computerised deep learning techniques and provides a predictive output 142a.
The predictive output can be the Cobb angle of the subject.
The alignment analysis module 140a provides the severity of any spinal deformity of the subject, and the curve classification of the deformity of the spine of the subject.
Referring now to Figure 1b, there is shown a further embodiment of a system and a process according to the present invention.
As is shown in Figure 1b, a further embodiment of a system 100b of the present invention is shown, for providing a radiograph-comparable image (RCI) of a spinal region of interest (ROI) of a subject. The system 100b includes:
(i) a Red Green Blue-Depth (RGBD) standardization module 110b, for receiving an RGBD input image 102b of the ROI of a subject and for providing a standardised RGBD image 104b of said ROI;
(ii) a back landmark detection module 120b, for detecting anatomic landmarks from the standardised RGBD image 104b of said ROI; and
(iii) a landmark guided RCI synthesis module 130b, for providing a synthesized RCI 132b of the ROI of the spine of the subject from the detected anatomical landmarks from the back landmark detection module and from the standardised RGBD image 104b;
The Red Green Blue-Depth (RGBD) standardization module 110b is implemented with rule-based and adaptive algorithms to standardize the images. The back landmark detection module 120b and the landmark guided RCI synthesis module 130b utilise computerised deep learning techniques.
An RGBD input image of a region of interest (ROI) of the spine of a subject is acquired by an RGBD image acquisition device for providing an RGBD input image 102b.
In the present embodiment, the system 100b also includes a quantitative alignment analysis module 140b in communication with the RCI synthesis module 130b, for analysing the alignment of the spine of the subject from the RCI, wherein the quantitative alignment analysis module 140b utilises computerised deep learning techniques and provides a predictive output 142b.
In the present embodiment, the system 100b further includes a radiographic standardisation module 150b which is implemented with rule-based and adaptive algorithms, wherein the radiographic standardisation module 150b receives a radiographic image 152b of an ROI of a spine and provides a standardised X-ray 154b, and wherein the RCI synthesis module 130b further provides the synthesized RCI of the ROI of the spine of the subject from the standardised X-ray 154b.
Referring now to Figure 1c, there is shown an exemplary embodiment of a process 100c according to the present invention.
The process 100c is operable on a computerized system such as is described with reference to the embodiments of Figure 1a and Figure 1b.
The process 100c provides a radiograph-comparable image (RCI) of the spinal region of a subject, using a system comprising a Red Green Blue-Depth (RGBD) standardization module; a back landmark detection module and a landmark guided RCI synthesis module.
Again, the Red Green Blue-Depth (RGBD) standardization module is implemented with rule-based and adaptive algorithms to standardize the images; and the back landmark detection module and the landmark guided RCI synthesis module utilise computerised deep learning techniques, said process including the steps as follows:
Step (i) 110c -acquiring an RGBD input image of a region of interest of the spine by way of an RGBD image acquisition device;
Step (ii) 120c -receiving the RGBD input image by the back landmark detection module;
Step (iii) 130c - synthesizing an RCI by way of the landmark guided RCI synthesis module; and
Step (iv) 140c - analysing the alignment of the spine of the subject from the RCI, by way of the quantitative alignment analysis module.
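By way of a non-limiting illustration, the four-stage process recited in steps (i) to (iv) above may be sketched as a modular pipeline. The function names and the placeholder module internals below are illustrative assumptions only; the claimed standardization, detection, synthesis and analysis modules use the rule-based algorithms and deep learning techniques described herein.

```python
import numpy as np

def standardize_rgbd(rgbd):
    # Placeholder for the rule-based/adaptive standardization module:
    # here simply normalizes all channels to the [0, 1] range.
    rgbd = rgbd.astype(np.float64)
    lo, hi = rgbd.min(), rgbd.max()
    return (rgbd - lo) / (hi - lo) if hi > lo else rgbd

def detect_landmarks(std_rgbd):
    # Placeholder for the deep learning detector: returns 6 (row, col)
    # landmark coordinates spaced along the vertical midline.
    h, w = std_rgbd.shape[:2]
    rows = np.linspace(0.1, 0.9, 6) * h
    cols = np.full(6, w / 2.0)
    return np.stack([rows, cols], axis=1)

def synthesize_rci(std_rgbd, landmarks):
    # Placeholder for the landmark guided generator: an RCI with the
    # same spatial size as the standardized input.
    return np.zeros(std_rgbd.shape[:2])

def analyse_alignment(rci):
    # Placeholder for the quantitative alignment analysis module:
    # returns a predicted Cobb angle in degrees.
    return 0.0

def rci_pipeline(rgbd):
    std = standardize_rgbd(rgbd)      # steps (i)/(ii): standardized RGBD input
    lms = detect_landmarks(std)       # back landmark detection module
    rci = synthesize_rci(std, lms)    # step (iii): RCI synthesis module
    angle = analyse_alignment(rci)    # step (iv): alignment analysis module
    return rci, lms, angle
```

The sketch only fixes the data flow between the four modules; each placeholder body would be replaced by the corresponding trained model or rule set in a working embodiment.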
The present inventor has sought to accurately quantify AIS spine malalignment without using any radiation techniques. Thus, Light-Based Radiograph-Comparable Image (RCI) Synthesis was explored, and a deep learning approach to synthesize RCI from nude back images containing RGB and depth information (RGBD) was developed, as shown in Figure 1d, which is implemented in a similar manner as described above in relation to Figure 1a and Figure 1b. Different from the previous techniques of the prior art, the models generate synthesized RCI containing anatomical morphology information of the spine to accurately quantify spinal alignment. The reliability of the models of the present invention was prospectively validated with multiple tasks in two clinics, including back landmark auto-detection, RCI synthesis, and scoliosis severity and curve type classification. The technology of the present invention has the potential to facilitate radiation-free, fast, and accurate AIS analysis.
The system 100d includes a pipeline consisting of 1) an RGBD and radiograph standardization module 110d, 2) a back landmark detection module 120d, 3) a landmark guided RCI synthesis module 130d and 4) a quantitative alignment analysis module 140d.
The first module 110d is implemented with rule-based and adaptive algorithms to standardize the images, while the last three modules, 120d, 130d and 140d, adopt deep learning techniques.
Figure 1e shows application and evaluation of the AI system. RGBD images captured with the smartphone and equipment of the present invention are transmitted to the cloud data centre and backend AI server that hosts the light-based RCI synthesis and AlignPro systems for analysis. The results can then be instantly transmitted and displayed back on the smartphone and equipment.
AlignProTM is a robust deep learning-based prediction of spinal alignments irrespective of image qualities acquired from smartphone photographs of radiographs displayed on PACS (picture archiving and communication system) , by Conova Medical Technology Limited.
Referring to Figure 1f, there is shown the proportion of different severity levels and genders among all participants.
Referring to Figure 1g, there is shown an example of aligned RGB, depth and corresponding radiograph. Abbreviation definition: RCI: radiograph-comparable image; RGB: Red Green Blue; RGBD: Red Green Blue-Depth; AI: artificial intelligence.
2. Methodology
The methodology for the validation and exemplary implementations of the present invention, is as follows:
2.1 SUMMARY OF VALIDATION OF PRESENT INVENTION
Background: Adolescent idiopathic scoliosis (AIS) is the most common type of spinal disorder affecting children. Clinical screening and diagnosis require physical and radiographic examinations, which are either subjective or increase radiation exposure.
Therefore, a radiation-free portable system and device was developed and validated, utilising light-based depth sensing and deep learning technologies to analyse AIS by landmark detection and image synthesis.
Methods: Consecutive patients with AIS attending two local scoliosis clinics in Hong Kong between October 9, 2019, and May 21, 2022, were recruited. Patients were excluded if they had psychological and/or systematic neural disorders that could influence the compliance of the study and/or the mobility of the patients.
For each participant, a Red Green Blue-Depth (RGBD) image of the nude back was collected using the in-house radiation-free device of the present invention. Landmarks and alignment parameters manually labelled by spine surgeons were considered as the ground truth (GT). Images from training and internal validation cohorts (n=1,936) were used to develop the deep learning models.
The model was then prospectively validated on another cohort (n=302) which was collected in Hong Kong and had the same demographic properties as the training cohort. The prediction accuracy of the model on nude back landmark detection was evaluated, as well as its performance on radiograph-comparable image (RCI) synthesis. The obtained RCI contains sufficient anatomical information to quantify disease severities and curve types.
Findings: The model of the present invention had a consistently high accuracy in predicting the nude or naked back anatomical landmarks, with a less than 4-pixel error in terms of the mean Euclidean and Manhattan distances.
The synthesized RCI for AIS severity classification achieved a sensitivity and negative predictive value of over 0.909 and 0.933, and the performance for curve type classification was 0.974 and 0.908, with spine specialists' manual assessment results on real radiographs as GT. The estimated Cobb angles from the synthesized RCIs have a strong correlation with the GT angles (R2=0.984, p<0.001).
Interpretation: The radiation-free medical device of the present invention, powered by depth sensing and deep learning techniques can provide instantaneous and harmless spine alignment analysis which has the potential for integration into routine screening for adolescents.
Details of the evaluation of the present invention, are as follows.
2.2 DATA COLLECTION AND PREPARATION.
From October 9, 2019, to January 15, 2022, all consecutive patients with AIS aged between 10 and 18 years old were recruited to form the training and internal validation dataset for the technology development. From January 16, 2022, to May 21, 2022, consecutive patients with AIS who attended two clinics in Hong Kong were recruited into a prospective testing dataset for the present invention.
Patients were excluded if they had psychological and/or systematic neural disorders that could influence the compliance of the study and/or the mobility of the patients.
Other exclusion criteria are:
1) any trauma that may impair posture and mobility,
2) severe dermatological conditions that may impair the optical imaging results, and
3) any known oncological disease.
The dataset includes patients from two scoliosis clinics: 1) The Duchess of Kent Children’s Hospital and, 2) Queen Mary Hospital in Hong Kong.
The sex of each participant was determined in terms of physiological characteristics (according to the information on his/her ID card). For each patient, a bareback RGBD image and a whole-spine standing posteroanterior radiograph were acquired. The RGBD imaging system consists of an RGBD camera25, a portable computer and a self-designed mobile stand as shown in Figure 6 and Figure 7.
The radiograph was acquired using the EOS™ (EOS Imaging, Paris, France) biplanar stereoradiography machine.
The differences between the two clinics are minimal, and a detailed study protocol was prepared to guide the experimental settings and patient behaviours when collecting data. To avoid any mis-operation, the involved technicians and clinical assistants were well-trained before joining the study. The proposed system consists of four modules as shown in Figure 1a: 1) an RGBD and radiograph standardization module, 2) a back landmark detection module, 3) a landmark guided RCI synthesis module, and 4) a quantitative alignment analysis module.
2.3 IMAGE PRE-PROCESSING
Image pre-processing was conducted to standardize the images and resolve the misalignment of RGBD images and radiographs obtained from different imaging systems. A novel image registration
algorithm was developed for this purpose to align the RGBD and radiographs according to the landmarks of the 7th cervical vertebra (C7) and the tip of coccyx (TOC) in radiographs and RGBD images.
To eliminate the impact of capture angle when collecting RGBD images, random sample consensus (RANSAC) has been used to standardize the image capture angle. Each captured image is standardized according to the checkerboard plane which is placed perpendicular to the ground and beside the patient, as shown in Figure 8a. Finally, both the aligned radiograph and the RGBD image were cropped to a 512×256 patch containing only the back region, as shown in Figure 8b.
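As a hedged illustration of the RANSAC step above, a minimal plane-fitting routine of the kind that could estimate the checkerboard plane from 3-D depth points is sketched below. The function name, the inlier tolerance and the iteration count are assumptions for illustration, not the claimed implementation.

```python
import numpy as np

def ransac_plane(points, n_iters=200, tol=0.01, seed=0):
    """Fit a plane to 3-D points with RANSAC.

    Returns (unit_normal, d) for the plane n.x + d = 0 with the most inliers.
    """
    rng = np.random.default_rng(seed)
    best_inliers, best_model = -1, None
    for _ in range(n_iters):
        # Sample a minimal set of 3 points and form the candidate plane.
        sample = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(normal)
        if norm < 1e-12:
            continue  # degenerate (collinear) sample
        normal /= norm
        d = -normal @ sample[0]
        # Count points within the inlier tolerance of the candidate plane.
        inliers = int((np.abs(points @ normal + d) < tol).sum())
        if inliers > best_inliers:
            best_inliers, best_model = inliers, (normal, d)
    return best_model
```

The recovered plane normal can then be used to rotate the captured depth data so that the checkerboard plane is axis-aligned, standardizing the capture angle.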
2.4 DATA LABELLING
For RGBD images, 6 anatomical landmarks on the nude back were annotated according to experienced surgeons' recommendations, including 1) C7, 2) the left inferior scapular angle, 3) the right inferior scapular angle, 4) the left posterior inferior iliac spine (PIIS), 5) the right PIIS, and 6) TOC, as shown in Figure 1d.
For radiographs, two landmarks (i.e., C7 and TOC) were annotated as the registration reference as shown in Figure 8c. All landmarks were manually annotated by spine surgeons with more than 20 years of clinical experience. The CA of each radiograph is manually labelled by two spine specialists. Measurements of the CAs had an absolute inter-rater variability from 4° to 6° (mean=4.5°±SD 0.6) between the two spine specialists20.
2.5 DEFINITION OF DEFORMITY SEVERITY, CURVE TYPES
The severity of spine deformity was classified into 3 categories according to clinical standard26, 27. Curves with a CA larger than 40° were classified as severe, between 20° and 40° were considered as moderate, between 0° and 20° were considered as normal-mild.
The deformity severity assessment standard and clinical management are presented in Table 1. The curve type was regarded as thoracic (T) if the curve apex was between the 1st to the 11th thoracic vertebrae and considered as thoracolumbar or lumbar (TL/L) if the curve apex was between the 12th thoracic vertebra and the 5th lumbar vertebrae.
Table 1 -Deformity severity assessment standard and clinical management
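The severity and curve-type definitions in Section 2.5 can be expressed directly as threshold rules. The sketch below assumes Cobb angles in degrees and apex labels such as "T8" or "L2"; the label format and the handling of the exact 20° and 40° boundaries are illustrative assumptions.

```python
def classify_severity(cobb_angle):
    # Severity grading per the clinical thresholds used in this study:
    # > 40 deg severe, 20-40 deg moderate, 0-20 deg normal-mild.
    # (Boundary handling at exactly 20/40 deg is an assumption.)
    if cobb_angle > 40:
        return "severe"
    if cobb_angle > 20:
        return "moderate"
    return "normal-mild"

def classify_curve_type(apex_vertebra):
    # Curve type from the apex label, e.g. "T5" or "L2" (format assumed):
    # T1-T11 -> thoracic (T); T12 and L1-L5 -> thoracolumbar/lumbar (TL/L).
    region, level = apex_vertebra[0].upper(), int(apex_vertebra[1:])
    if region == "T" and level <= 11:
        return "T"
    return "TL/L"
```

For example, a curve with a 45° Cobb angle and apex at T12 would be graded severe with curve type TL/L under these rules.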
2.6 LIGHT-BASED RCI SYNTHESIS SYSTEM
The light-based RCI synthesis system consisted of a back landmark detection module and an RCI synthesis module as shown in Figure 9. The back landmark detection module adopted the modified HRNet backbone as shown in Figure 10 to predict the 6 anatomical landmarks. It utilised different
branches to extract multi-scale features from the images and then integrated these features to achieve better model outputs.
The number of channels output by the module was 6, and each channel was a heatmap providing the probability of the position of the landmark. The RCI synthesis module adopted the CycleGAN framework but used ResNet as the generator. It took the concatenation of the RGBD images (4 channels) and the 6 detected landmark heatmaps (6 channels), in total 10-channel data, as input and output an RCI image. The details of the ResNet model and PatchGAN model can be found in Figure 11.
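A minimal sketch of how the 10-channel generator input may be assembled from a 4-channel RGBD image and the 6 landmark heatmaps, and how each heatmap may be decoded to a landmark coordinate, is given below. The array layouts (channel-last RGBD, channel-first heatmaps) are illustrative assumptions, not the claimed data format.

```python
import numpy as np

def assemble_synthesis_input(rgbd, heatmaps):
    """Concatenate a 4-channel RGBD image of shape (H, W, 4) with 6 landmark
    heatmaps of shape (6, H, W) into the 10-channel generator input."""
    assert rgbd.shape[2] == 4 and heatmaps.shape[0] == 6
    # Move RGBD to channel-first layout, then stack along the channel axis.
    return np.concatenate([rgbd.transpose(2, 0, 1), heatmaps], axis=0)

def heatmap_to_coords(heatmaps):
    """Decode each probability heatmap to the (row, col) of its maximum."""
    coords = []
    for hm in heatmaps:
        coords.append(np.unravel_index(np.argmax(hm), hm.shape))
    return np.array(coords)
```

The argmax decode is the simplest heatmap-to-coordinate rule; sub-pixel refinements (e.g. a weighted centroid around the peak) are common alternatives.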
The quantitative alignment analysis module utilised the online platform (AlignPro27) for automatic CA prediction. Both original radiographs and generated RCIs were sent to the server of AlignPro and then the endplate landmarks of the end vertebrae located in different spinal curves could be obtained.
The obtained landmarks were further analysed and adjusted by senior clinicians to eliminate noise in the automatic predictions and fill the missing landmarks without reference to the original radiographs. The final output CAs from this module were calculated according to the landmarks reviewed by the clinicians using the AlignPro platform.
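For illustration, once the endplate landmarks of the end vertebrae are available, the Cobb angle may be computed as the angle between the superior endplate line of the upper end vertebra and the inferior endplate line of the lower end vertebra. This is a generic sketch of the standard Cobb measurement under that assumption, not the AlignPro implementation.

```python
import math

def cobb_angle(upper_endplate, lower_endplate):
    """Cobb angle (degrees) between two endplate lines, each given as a pair
    of (x, y) endplate landmark points."""
    def slope_angle(p, q):
        # Orientation of the line through p and q, in radians.
        return math.atan2(q[1] - p[1], q[0] - p[0])

    ang = abs(slope_angle(*upper_endplate) - slope_angle(*lower_endplate))
    ang = math.degrees(ang) % 180.0
    # Lines are undirected, so report the acute/obtuse-invariant angle.
    return min(ang, 180.0 - ang)
```

For example, endplates tilted at +10° and -20° from horizontal yield a Cobb angle of 30°.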
2.7 RGBD IMAGING DEVICE
The RGBD imaging device mainly consists of an RGBD camera, a portable computer, and a self-designed mobile stand. The RGBD camera is an Azure Kinect DK camera which is used to record both appearance and depth information of the nude back.
A portable computer (with Windows OS, an Intel Core i5 CPU and 16GB memory) is connected to the camera and users can operate the camera for RGBD image capturing and archiving through self-developed software of the present invention. Both the camera and computer are installed on a portable stand for the convenience of users. The appearance of device for the data collection is shown in Figure 1e.
2.8 PERFORMANCE METRICS FOR LANDMARK DETECTION
Mean Euclidean distance (MED) and mean Manhattan distance (MMD) were adopted as quantitative measurements to evaluate the performance of the landmark detection. MED measures the average Euclidean distance between the predicted landmark and the GT landmark, while MMD measures the average distance along the axes. The definition of these two measurements is:

MED = \frac{1}{NM} \sum_{i=1}^{N} \sum_{j=1}^{M} \left\| p_{ij} - \hat{p}_{ij} \right\|_{2}

MMD = \frac{1}{NM} \sum_{i=1}^{N} \sum_{j=1}^{M} \left\| p_{ij} - \hat{p}_{ij} \right\|_{1}

where p_{ij} denotes the coordinates of the GT landmark, \hat{p}_{ij} denotes the coordinates of the predicted landmark, M is the number of landmarks and N is the number of patients.
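The two distance measurements may be implemented, for example, as follows; the function name and the (N, M, 2) array layout are illustrative assumptions.

```python
import numpy as np

def med_mmd(gt, pred):
    """Mean Euclidean distance (MED) and mean Manhattan distance (MMD)
    between GT and predicted landmarks, arrays of shape (N, M, 2) for
    N patients and M landmarks."""
    diff = np.asarray(gt, dtype=float) - np.asarray(pred, dtype=float)
    med = np.linalg.norm(diff, axis=-1).mean()   # L2 norm per landmark, then average
    mmd = np.abs(diff).sum(axis=-1).mean()       # L1 norm per landmark, then average
    return med, mmd
```

For a single patient with GT landmarks at the origin and predictions at (3, 4) and (1, 1), MED is (5 + √2)/2 and MMD is (7 + 2)/2 = 4.5.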
2.9 QUALITY METRICS FOR THE RCI SYNTHESIS
Five image quality metrics, namely Fréchet inception distance (FID) 28, Visual information fidelity (VIF) 29, Learned perceptual image patch similarity (LPIPS) 30, Natural image quality evaluator (NIQE) 31, and Blind/referenceless image spatial quality evaluator (BRISQUE) 32, were selected to evaluate the RCI synthesis performance (details refer to Supplementary Section 2.2) .
2.10 STATISTICAL ANALYSIS
Referring now to Figure 2a, there is shown statistical analysis and visual results of the landmark detection on the bare back: a combination of violin plot, box plot and scatter plot of the MED (left figure) and MMD (right figure) values for the 6 landmarks. Figure 2b shows depth and combined heatmaps of the 6 bare-back anatomical landmarks for landmark detection. Three samples are presented for each disease severity (normal-mild, moderate and severe). The left side presents the contour plot of the depth image of each patient with predicted and GT landmarks.
The colourbar indicates the height of the surface, measured in millimetres relative to the height of landmark C7. The higher the region, the closer it is to the camera (e.g., red means closer to the camera while blue means further away). The right side displays the RGB image of the patient as background, and the landmark heatmap as foreground.
The colourbar denotes the probability of the appearance of the landmarks (e.g., red means higher probability and blue means lower probability). Abbreviation definition: MED: mean Euclidean distance; MMD: mean Manhattan distance; GT: ground truth; C7: the 7th cervical vertebra. The violin plot shows the distribution of the error values. The box plot is a graphical representation of the distribution of the data, with the box representing the interquartile range, the line inside the box denoting the median, and the whiskers representing the range of the data.
To evaluate the performance of the model of the present invention on back landmark detection, the statistics and error distribution for each landmark were analysed using the violin plot, box plot, and scatter plot in Figure 2b. The error was measured using MED and MMD. The box plot indicates the 1st, 2nd, and 3rd quartiles (Q1, Q2 and Q3), the interquartile range (IQR), and the minimum and maximum values (excluding the outliers) of the data points. The violin plot provides an intuitive view of the error distribution. The region smaller than zero was cropped due to the nonnegativity of MED and MMD. Meanwhile, the region larger than the 95th percentile was also cropped to ignore the outliers. The scatter plot shows how the error data points are distributed along the y-axis.
Referring now to Figure 4, there is shown performance evaluation of the model of the present invention on CA prediction. Figure 4a shows a linear regression analysis of CAs on the prospective test data. The x-axis denotes the CA degrees predicted from synthesized RCIs of the present invention, while the y-axis denotes the CA degrees obtained from GT radiographs. Figure 4b shows Bland-Altman plots assessing the agreement of CAs obtained from synthesized and GT radiographs on the prospective test data.
The x-axis represents the average degree of CAs obtained from synthesized and GT radiographs, while the y-axis denotes the difference between them. Figure 4c shows the confusion matrix for the severity grading (3 types: normal-mild, moderate, and severe). Figures 4d and 4e show confusion matrices for detection of the presence of T and TL/L curves, respectively. Figure 4f shows the confusion matrix for major curve type prediction. Abbreviation definition: CA: Cobb angle; RCI: radiograph-comparable image; GT: ground truth; T: thoracic; TL/L: thoracolumbar/lumbar; SD: standard deviation. The p-value helps to determine the correlation between the CAs predicted using real radiographs and RCIs, and the small value (p-value < 0.0001) indicates that they have a strong correlation.
To quantitatively assess the validity of the CA parameters estimated from the synthesized RCI, linear regression and Bland-Altman analyses were conducted. In the regression plot shown in Figure 4a, the regression line (blue line), the 95% confidence interval line (green dashed line) of the predictions, and the ideal correspondence (red line) between the predicted CAs on synthesized RCIs and GT radiographs are presented. The Bland-Altman analysis was performed between the mean of the predicted CAs on RCIs and GT radiographs and their residual, to assess the agreement between them.
Three classification tasks, namely severity grading (3 types: normal-mild, moderate, and severe), curve detection, and curve type prediction (2 types: T and TL/L), were assessed using confusion matrix analysis as shown in Figures 4c-f. For curve detection, patients were distinguished from healthy controls according to the normal range of CA (CA < 10°). To evaluate the severity grading and curve type classification performance of the deep learning model of the present invention, five statistical measurements were calculated, including sensitivity (Sn), specificity (Sp), precision (Pr), negative predictive value (NPV), and accuracy (Acc), according to the following equations:

Sn = \frac{TP}{TP + FN}, \quad Sp = \frac{TN}{TN + FP}, \quad Pr = \frac{TP}{TP + FP}, \quad NPV = \frac{TN}{TN + FN}, \quad Acc = \frac{TP + TN}{TP + TN + FP + FN}

where TP, TN, FP, and FN refer to true positive, true negative, false positive, and false negative predictions, respectively. All the statistical analysis was done using Python (v3.8) and several Python packages, including NumPy (v1.18.5), SciPy (v1.5.2), Ptitprince (v0.2.6), pandas (v1.1.3), seaborn (v0.11.0), and Matplotlib (v3.3.2).
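The five measurements above may be computed directly from the confusion counts, for example as follows (the function name is an illustrative assumption):

```python
def classification_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity, precision, NPV and accuracy from the
    true/false positive and negative counts of a binary confusion matrix."""
    return {
        "Sn": tp / (tp + fn),                  # sensitivity (recall)
        "Sp": tn / (tn + fp),                  # specificity
        "Pr": tp / (tp + fp),                  # precision
        "NPV": tn / (tn + fn),                 # negative predictive value
        "Acc": (tp + tn) / (tp + tn + fp + fn) # accuracy
    }
```

For the 3-class severity task, the same formulas apply per class in a one-vs-rest fashion over the confusion matrix.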
The sub-sampling sample size determination method (SSDM) has been used to examine how sample size affects the two models of the present invention. Unfortunately, the application of SSDM to deep learning in medical imaging is still in its early stages, and there is no SSDM suitable for the problem studied here. As a result, a practical SSDM, i.e., a curve-fitting method33, was chosen to empirically assess the model's effectiveness at various sample size proportions. The training data was randomly sub-sampled by a proportion factor of 4%, 8%, 16%, 32%, 64%, and 100%, and for each factor,
the models were trained 10 times as shown in Figure 13 and Figure 14. For each training run, the weights from the training history with the minimal validation loss were stored and then assessed on the prospective testing dataset, yielding 10 test loss estimates for each proportion factor, as shown in Figure 15. For both deep learning models, the model performance improved as the number of training samples increased. For the landmark detection model, the sampling proportion should be larger than 32% to ensure model performance and stability, while for the RCI synthesis model, all training samples should be used to achieve the best model performance.
3. Results
The results of the methods and methodology for the validation and implementation of embodiments of the present invention, are as follows.
3.1 DATASETS
Between October 9, 2019, and May 21, 2022, 2238 participants were enrolled. The demographic information of the technology development and prospective testing cohorts is presented in Table 2. The study population included 1,936 patients (1,410 female, 72.8%) in the training and internal validation cohorts for the model development. An additional 302 patients (226 female, 74.8%) were recruited prospectively from two local spine clinics for performance testing.
There was no patient overlap between the model development and prospective testing cohorts. In the training and validation cohort, there were 579 patients (29.9%) classified as normal-mild, 1,132 patients (58.5%) classified as moderate, and 228 patients (11.6%) classified as severe. In the prospective cohort, there were 85 patients (28.1%) classified as normal-mild, 184 patients (60.9%) classified as moderate, and 33 patients (10.9%) classified as severe. The characteristics of the different cohorts are summarized in Table 2.
The percentage of patients with different severities, and for each severity the proportion for each sex, is also presented as shown in Figure 1f.
Abbreviation definition: CA: Cobb angle; SD: standard deviation;
BMI: body mass index; T: thoracic; TL/L: thoracolumbar/lumbar.
Table 2 -Demographic information of the study population.
3.2 PERFORMANCE OF BACK LANDMARK DETECTION ON THE BACK IMAGES.
Table 3 evaluates the landmark detection on the prospective test dataset. For each anatomical landmark, both the mean and standard deviation (SD) are reported. The detection of the C7 and TOC landmarks achieved the best performance (mean±SD) compared with the other landmarks in terms of MED and MMD (C7: MED=1.0±0.5, MMD=1.2±0.6; TOC: MED=1.0±0.5, MMD=1.3±0.6) . The detection of the left and right PIIS achieved an inferior performance (Left PIIS: MED=1.6±1.2, MMD=2.0±1.3; Right PIIS: MED=1.7±1.2, MMD=2.1±1.2) .
For the detection of the left and right inferior scapular angles, the average values of both MED and MMD were less than 4 pixels (Left inferior scapular angle: MED=3.0±2.1, MMD=3.6±2.4; Right inferior scapular angle: MED=2.9±2.3, MMD=3.5±2.4) . For all 6 landmarks, the values of both MED and MMD follow a unimodal distribution, as analysed quantitatively with the violin plot, box plot, and scatter plot shown in Figure 2a. Visual comparisons for landmark detection on back depth contours are presented in Figure 2b with the heatmaps of the 6 landmarks. Visual examples of landmark detections are presented for 9 patients diagnosed with different AIS severities and curve types.
Abbreviation definition: MED: mean Euclidean distance; MMD: mean Manhattan distance; SD: standard deviation. C7: 7th cervical vertebra; PIIS: posterior inferior iliac spine; TOC: tip of coccyx. P1 - P6 correspond to the landmarks in Fig. 2.
Table 3-Evaluation metrics on landmark prediction between the 6 predicted landmarks and their corresponding Ground Truth landmarks on the prospective test dataset.
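The MED and MMD metrics reported in Table 3 can be computed as in the following sketch, assuming the predicted and ground-truth landmarks are available as pixel coordinate arrays (the array shapes and the toy offset are assumptions for illustration):

```python
import numpy as np

def landmark_errors(pred, gt):
    """Per-landmark mean Euclidean distance (MED) and mean Manhattan
    distance (MMD), in pixels.

    pred, gt: arrays of shape (num_images, num_landmarks, 2).
    Returns two arrays of shape (num_landmarks,).
    """
    diff = pred - gt
    med = np.sqrt((diff ** 2).sum(axis=-1)).mean(axis=0)
    mmd = np.abs(diff).sum(axis=-1).mean(axis=0)
    return med, mmd

# Toy check: a constant (3, 4)-pixel offset on the first landmark gives
# MED = 5 (Euclidean) and MMD = 7 (Manhattan) for that landmark.
pred = np.zeros((5, 6, 2))
pred[:, 0] = [3.0, 4.0]
gt = np.zeros((5, 6, 2))
med, mmd = landmark_errors(pred, gt)
```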
For landmark detection, the performance of 4 classical deep learning architectures with a similar number of parameters has been compared (Table 4) . For all 6 anatomical landmarks, the HRNet34 backbone achieves the best performance in terms of MED and MMD (Table 5) . The direct outputs of the back landmark detection module are 6 landmark heatmaps.
A high value means the probability of the landmark being located at the corresponding position is high, and vice versa. A probability threshold was used to filter the heatmaps and evaluate their quality. For different threshold values (from 0.1 to 0.6) , HRNet achieved the highest landmark retrieval rate for all 6 landmarks, and its retrieval rate was relatively stable when changing the threshold value compared with the other architectures.
*FLOPs stands for floating point operations. Inference memory means the required memory during the model testing.
Table 4- Summary of the backbones used for landmark detection.
*Bold text means the best performance. C7: the 7th cervical vertebra; PIIS: posterior inferior iliac spine; TOC: tip of coccyx.
Table 5 - Comparison of the performance on landmark detection using different deep learning models.
*Note that a model is considered to have failed if the retrieval rate is smaller than 0.1, and the dash symbol (-) is used to denote these items. C7: the 7th cervical vertebra; PIIS: posterior inferior iliac spine; TOC: tip of coccyx.
Table 6 -Comparison of performance on landmark retrieval rate in terms of probability threshold on the test data.
3.3 PERFORMANCE OF DEEP LEARNING FRAMEWORK FOR RCI SYNTHESIS.
Referring to Figures 3a-3f, there is shown RCI synthesis results from the RCI synthesis module. Each subfigure presents a case with a certain severity level and curve type of scoliosis. From left to right, each subfigure presents the RGB image of the patient’s nude back, depth image of the patient’s nude back, the GT radiograph of the spine region and the synthesized RCI. Figure 3a and Figure 3b exhibit the RCI synthesis results for normal-mild cases.
Figure 3c and Figure 3d show moderate cases. Among them, the patient in Figure 3c has both T and TL curves while the one in Figure 3d has only a TL curve. Figure 3e and Figure 3f present severe cases. Among them, the patient in Figure 3e has a T curve while the patient in Figure 3f has both T and TL curves. Abbreviation definition: RCI: radiograph-comparable image; T: thoracic; TL: thoracolumbar; RGB: Red Green Blue; GT: ground truth.
Patients with different severity levels and spinal curve types are presented to demonstrate the synthesis performance of the model on various cases as shown in Figure 3 of the present invention. The RGB image, depth image (in grayscale) , synthesized RCIs and GT radiographs are displayed sequentially from the left to right. The normal-mild as shown in Figure 3a and Figure 3b, moderate as shown in Figure 3c and Figure 3d, and severe as shown in Figure 3e and Figure 3f AIS cases are displayed, and for each severity level (except Normal-mild) , two cases with different curve types are visualized.
After comparing 12 combinations of generators and discriminators (Table 7) in terms of 7 quantitative image quality metrics, the pixel-wise metrics, namely peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) , proved unstable and inconsistent with the other metrics, and were thus not suitable as measurements for RCI synthesis. The other 5 metrics, namely, 1) FID28, 2) VIF29, 3) LPIPS30, 4) NIQE31, and 5) BRISQUE32, performed consistently and indicated that a ResNet35 with 9 blocks as generator and a 5-convolutional-layer PatchGAN36 as discriminator achieved the best performance.
*PSNR: peak signal-to-noise ratio; SSIM: structural similarity index; VIF: visual information fidelity; FID: Fréchet inception distance; LPIPS: learned perceptual image patch similarity; NIQE: natural image quality evaluator; BRISQUE: blind/referenceless image spatial quality evaluator. The symbol ↑ means the larger the value is, the better the model performs while symbol ↓ means the lower the value is, the better the model performs. The best performance is bold and highlighted in red. The 2nd best performance is emphasized with underscore and highlighted in blue. For generator, “ResNet_6” means the ResNet architecture with 6 residual blocks while “ResNet_9” means the ResNet architecture with 9 residual blocks. “UNet_128” means UNet architecture (for input 128x128) with 7 different scales of features while “UNet_256” means UNet architecture (for input 256x256) with 8 different scales of features. For discriminator, “5 layers” means the PatchGAN architecture with 5 convolutional layers, “8 layers” means the PatchGAN architecture with 8 convolutional layers, and the PixelGAN refers to a 5-layer CNN with PixelGAN architecture.
Table 7 - Quantitative evaluation of the performance of different deep learning models for RCI synthesis.
3.4 PERFORMANCE ON COBB ANGLE PREDICTION.
The CAs quantified from the synthesized RCIs have a strong correlation with the GT angles (R2=0.984, p<0.001) , as tested by linear regression. Additionally, the slope of the regression line is 45.99°, comparable to the ideal value of 45°. According to the Bland-Altman plot shown in Figure 4b, the mean difference between the CAs obtained from the GT and synthesized RCIs is minimal at -0.86°.
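The regression and Bland-Altman statistics above can be reproduced with a short sketch; the function below (illustrative, not the study's code) returns the regression slope, R2, mean difference, and 95% limits of agreement for paired Cobb angle measurements:

```python
import numpy as np

def agreement_stats(ca_gt, ca_pred):
    """Agreement between GT and predicted Cobb angles (degrees)."""
    slope, intercept = np.polyfit(ca_gt, ca_pred, 1)   # regression line
    r = np.corrcoef(ca_gt, ca_pred)[0, 1]              # Pearson correlation
    diff = np.asarray(ca_pred) - np.asarray(ca_gt)     # Bland-Altman differences
    mean_diff = diff.mean()
    loa = (mean_diff - 1.96 * diff.std(), mean_diff + 1.96 * diff.std())
    return slope, r ** 2, mean_diff, loa
```

A slope near 1 with R2 near 1 and a near-zero mean difference (narrow limits of agreement) indicates that the two angle sources can be used interchangeably.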
Referring now to Figure 5, there is shown a visual comparison of practical CA measuring in clinics using the original radiographs (left) and the synthesized RCI counterparts (right) . The first row presents normal-mild cases with single or double spine curves. The second row presents moderate cases with single or double spine curves. The third row presents severe cases with double or triple spine curves. The patient number is printed on the top left corner on the GT radiograph to indicate different patients. The curve type and CA measuring results are printed on the bottom of each radiograph. Abbreviation definition: CA: Cobb angle; RCI: radiograph-comparable image; GT: ground truth.
For each severity, three examples were visualized with different curve types to demonstrate the robustness of the model for heterogeneous cases as shown in Figure 5 of the present invention. Predicted CAs using synthesized RCIs were comparable with the angles measured using GT spine radiographs. The spine morphology of the synthesized RCIs was also close to the GT spine radiographs.
3.5 PERFORMANCE ON SEVERITY GRADING & CURVE TYPE CLASSIFICATION & PREDICTION.
For severity classification (Table 8) , both specificity (Sp) and accuracy (Acc) exhibited relatively high scores in grading all three severity levels. The negative predictive value (NPV) had a high score (0.981) for Normal-Mild cases (Table 8) . The precision had the highest score for both Moderate cases (0.962) and Severe cases (1.000) (Table 8) .
For curve type classification, the sensitivity (Se) and precision (Pr) achieved the highest scores (T: Se=0.978, Pr=0.969; TL/L: Se=0.974, Pr=0.958) . The confusion matrices comparing synthesized RCIs and GT spine radiographs are illustrated for severity classification as shown in Figure 4c, curve detection for the T curve as shown in Figure 4d, curve detection for the TL/L curve as shown in Figure 4e, and the major curve as shown in Figure 4f.
Abbreviation definition: NPV: negative predictive value; T: thoracic; TL/L: thoracolumbar/lumbar
Table 8 -Quantitative performance on severity grading and curve type identification on the prospective test dataset.
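The per-class metrics reported in Table 8 (sensitivity, specificity, precision, NPV, and accuracy) follow directly from a confusion matrix via one-vs-rest counting; the sketch below is a generic computation, not the study's exact evaluation code:

```python
import numpy as np

def per_class_metrics(cm):
    """Per-class metrics from a square confusion matrix cm[true, predicted]."""
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    metrics = {}
    for k in range(cm.shape[0]):
        tp = cm[k, k]
        fn = cm[k, :].sum() - tp          # class-k cases predicted otherwise
        fp = cm[:, k].sum() - tp          # other cases predicted as class k
        tn = total - tp - fn - fp
        metrics[k] = {
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
            "precision": tp / (tp + fp),
            "npv": tn / (tn + fn),
            "accuracy": (tp + tn) / total,
        }
    return metrics
```

For the 3-level severity grading, the same computation applies to the 3×3 confusion matrix, treating each severity level in turn as the positive class.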
4. Further Supplemental Detailed Description
4.1. DATASET CONSTRUCTION.
A novel dataset was constructed for the development and evaluation of the light-based RCI synthesis system of the present invention. To the best knowledge of the inventor, this is the first dataset consisting of paired RGBD images and radiographs for evaluating model performance on radiograph synthesis. The RGBD images were collected using the Microsoft Azure Kinect DK camera38. This camera can record both the RGB image and the depth image of a scene simultaneously in one capture.
The radiographs were obtained from the EOSTM machine. The machine generates two paired radiographs, i.e., an anteroposterior radiograph and a lateral radiograph. Only the anteroposterior radiograph was included in the dataset, as the focus was on coronal Cobb angle measurement.
The dataset is being collected and enlarged continuously and, at the time of manuscript submission, contained 2,238 patients. These data were collected between October 2019 and May 2022 in two centers, i.e., The Duchess of Kent Children’s Hospital and Queen Mary Hospital in Hong Kong. All collected RGBD images and radiographs were assessed manually by specialists and experts according to the standard presented in Table 1.
4.1.1 RGBD photographic protocol
The RGBD imaging system mainly consists of a consumer-grade RGBD camera, i.e., the Microsoft Azure Kinect DK camera38, a portable computer and a self-designed mobile stand. The whole imaging system was deployed in the spine clinic of the study as shown in Figure 6 and Figure 7. To ensure good quality of the captured images, the height of the camera was set to 1 meter and patients were instructed to expose their back at a distance of 1.2 meters from the camera, standing on the patient anchor.
These settings enable the camera of the present invention to capture abundant depth information as well as the entire back region (the scope of the depth camera contains the back region) . A checkerboard paper was used to adjust the camera pose to ensure the imaging plane is perpendicular to the optical axis of the camera lens. The self-developed RGBD standardization algorithm embedded in the equipment of the present invention automatically processes the original RGBD images from the camera and outputs standardized RGBD images.
4.1.2 Radiographic protocol
The radiographs were acquired using the EOSTM (EOS imaging, Paris, France) biplanar stereoradiography machine. Patients were required to remove metallic accessories before being led into the scanning cabin of the machine. Both anteroposterior (AP) and lateral radiographs were scanned simultaneously at a scanning speed of 4.6 cm/s. The radiation dose for the AP radiograph was 143.29 mGy·cm2 and the scanning region was ±22.4 cm, which contains the patient body. The AP radiographs were collected using the software of the machine and anonymized before being sent to the radiograph standardization module for image pre-processing to obtain standardized radiographs.
4.1.3 Experimental settings and image collection for RGBD images.
The collection of RGBD images was performed under relatively rigorous experimental settings to ensure image quality. Figure 6 presents the sketch map of the experimental settings for RGBD image collection and Figure 7 provides the corresponding real experimental settings. The Azure Kinect DK camera was installed on a self-developed mobile stand of the present invention and the height of the camera lens was about 0.95 meters (0.95±0.03 m) . The patient was required to stand directly in front of the camera. To ensure this, after careful measurement, an anchor (patient anchor) was placed on the ground to denote the standing point, and each patient was required to stand on this point when the RGBD image was captured. The distance between the anchor and the camera lens was about 1.2 meters (1.2±0.05 m) to ensure both the RGB camera scope (red rectangle in Figure 6) and the depth camera scope (green hexagon in Figure 6) contained the patient body.
The Azure Kinect DK camera was connected to a computer with the Azure Kinect SDK software39 installed to collect and standardize the images. During image collection, all research assistants were instructed to capture a short video (1-2 seconds) of the patient’s nude back. A pair of RGB and depth images was extracted from the same frame of each short video.
4.1.4 Image collection for radiographs.
The radiographs were captured using the EOSTM biplanar stereoradiography machine (EOS imaging, Paris, France) . A pair of anteroposterior and lateral radiographs was captured each time and only the anteroposterior radiograph was collected for this dataset. The images were deidentified before being sent to the researchers.
4.1.5 Image annotation for RGBD and radiograph.
6 anatomical landmarks were selected that have been proven effective for the diagnosis of adolescent idiopathic scoliosis (AIS) in clinical applications40, 41. These landmarks included: 1) the 7th cervical vertebra (C7) , 2) the left inferior scapular angle, 3) the right inferior scapular angle, 4) the left posterior inferior iliac spine (PIIS) , 5) the right PIIS, and 6) the tip of coccyx (TOC) . All 6 landmarks in each RGB bareback image were manually labelled by senior surgeons with over 20 years’ clinical experience. These landmarks were used as the ground truth (GT) to train the AI model of the present invention used in the landmark detection module.
To enable image registration between the RGBD and X-ray images, two landmarks were annotated on each X-ray image, i.e., C7 and the tip of coccyx. The selection of these two landmarks underwent careful consideration. First, compared with the other landmarks, these two landmarks have the most distinctive features to be identified on the bareback image. The landmark detection results shown in Figure 2a also demonstrate this: the detection of C7 and the tip of coccyx achieves minimal error and variance in terms of mean Euclidean distance (MED) and mean Manhattan distance (MMD) . Second, among all 6 landmarks, they are the only two landmarks directly located on the spine and are easy to identify on the X-ray image. The two landmarks in each X-ray image were also manually labelled by senior surgeons with over 20 years’ clinical experience.
The cross-modality registration algorithm was designed to utilize these two landmarks to match each pair of RGB bareback image and X-ray image.
4.1.6 Automated image standardization module
The image standardization module consists of the RGBD standardization module and the radiograph standardization module.
4.1.7 RGBD standardization module
The RGBD standardization module matches each pair of RGB and depth images, normalizes the depth image and crops the back region. First, given that the RGB and depth images have different resolutions, the raw images were pre-processed using the inherent function (k4a_transformation_depth_image_to_color_camera) of the Azure Kinect SDK to map the depth camera scope to the RGB camera scope. As a result, a transformed depth image is obtained which has the same resolution as the RGB image and achieves pixel-level alignment between the two.
After that, as shown in Figure 8c, the depth image was further processed to remove the background and keep the human foreground. The histogram of the depth image counts the depth value at each pixel position. According to the experimental results, there are two summits in the histogram plot, i.e., the first one for the foreground and the second one for the background. The depth values belonging to the second summit are removed, and the remaining depth values are normalised to the range [0, 1] for the convenience of model training. Finally, the back region in both the RGB and depth images was cropped according to the 6 landmarks. The detailed RGBD standardization pipeline is presented in Figure 8.
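The background removal and normalisation steps above can be sketched as follows. An Otsu-style split between the two histogram summits is assumed here as one concrete realization; the actual embedded algorithm may locate the valley between the summits differently.

```python
import numpy as np

def otsu_cut(values, num_bins=256):
    """Threshold separating the two summits of a bimodal histogram by
    maximizing the between-class variance (Otsu's method)."""
    hist, edges = np.histogram(values, bins=num_bins)
    w = hist.astype(float) / hist.sum()
    centers = 0.5 * (edges[:-1] + edges[1:])
    best_t, best_var = centers[0], -1.0
    for i in range(1, num_bins):
        w0, w1 = w[:i].sum(), w[i:].sum()
        if w0 == 0.0 or w1 == 0.0:
            continue
        m0 = (w[:i] * centers[:i]).sum() / w0
        m1 = (w[i:] * centers[i:]).sum() / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, centers[i]
    return best_t

def standardize_depth(depth):
    """Remove the background summit of the depth histogram and normalise
    the remaining (foreground) depths to [0, 1]. Zeros mark removed pixels."""
    cut = otsu_cut(depth[depth > 0])
    fg = np.where((depth > 0) & (depth < cut), depth, 0.0)
    nz = fg > 0
    if nz.any():
        fg[nz] = (fg[nz] - fg[nz].min()) / (fg[nz].max() - fg[nz].min() + 1e-8)
    return fg
```

On a bimodal depth map (foreground around 1.2 m, background beyond 2 m), the cut falls in the gap between the two summits and only the foreground survives.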
4.1.8 Radiograph standardization module
The X-ray standardization module aligns the X-ray image with the RGB image according to the annotations. Since C7 and TOC have been annotated in both RGB images and X-ray images, they can be used as reference key points to align the RGB and X-ray images, so that the back region can be largely overlapped in both images. The algorithm pipeline is presented in Figure 8a and Figure 8b.
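Since C7 and the TOC are annotated in both modalities, the alignment reduces to estimating a similarity transform (scale, rotation, translation) from two point correspondences. The following is a minimal sketch of that idea using complex arithmetic; it is illustrative and not the exact registration algorithm of the invention:

```python
def two_point_similarity(src_pts, dst_pts):
    """Similarity transform mapping two source landmarks (e.g. C7 and TOC
    on the radiograph) onto their counterparts on the RGB image.

    Representing a point (x, y) as x + iy, the transform is z -> a*z + b,
    where a encodes scale and rotation and b the translation.
    """
    s0, s1 = complex(*src_pts[0]), complex(*src_pts[1])
    d0, d1 = complex(*dst_pts[0]), complex(*dst_pts[1])
    a = (d1 - d0) / (s1 - s0)
    b = d0 - a * s0
    return a, b

def apply_transform(a, b, pt):
    z = a * complex(*pt) + b
    return z.real, z.imag
```

Mapping every radiograph pixel through z -> a*z + b overlays the spine region on the bareback image, so the back regions largely overlap in both modalities.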
4.2 OVERVIEW OF THE AI SYSTEM IN THE PROPOSED LIGHT-BASED RCI SYNTHESIS SYSTEM.
An overview of the light-based RCI synthesis system is presented in Figure 5, showing the structure of the deep learning models in the different modules and the data flow between the modules. The detailed architectures of the models are presented in Figure 10 and Figure 11 for the HRNet backbone and the GAN models (generator and discriminator) , respectively.
4.2.1 Back landmark detection module.
The back landmark detection module adopts the HRNet42 backbone for anatomical landmark detection. The detailed structure of HRNet for end vertebra detection and landmark detection is presented in Figure 10. The convolutional layers are used for feature extraction and representation learning.
The hyperparameters of each convolutional layer are presented in parentheses with the format “c#k#s#p#”, where # denotes a digital number. For example, the layer Conv2d (c64k3s2p1) means a convolutional layer with 64 filters, 3×3 kernel size, stride 2, and zero-padding size 1. Different colors are used to denote different types of layers or composite layers. For composite layers, the detailed contents are listed beside the network structure in the figure. The batch-normalization layer (denoted as BatchNorm2d) normalizes the input features through the equation:

y = γ (x − μ) / √(σ2 + ε) + β

where μ and σ2 are the per-channel mean and variance of the input features, γ and β are learnable affine parameters, ε = 10-5, and the momentum is 0.01 in all batch-normalization layers. The activation layer adopted in the network is the widely used rectified linear unit (ReLU) layer, which activates the outcome of the previous layer through the following equation:

ReLU (x) = max (0, x)

where x denotes the outputs of the previous layer.
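The “c#k#s#p#” notation and the two equations above can be made concrete with a short sketch. parse_conv_spec is a hypothetical helper for the notation, and batch_norm_2d/relu follow the standard definitions referred to in the text:

```python
import re
import numpy as np

def parse_conv_spec(spec):
    """Parse a 'c#k#s#p#' layer description, e.g. 'c64k3s2p1'."""
    c, k, s, p = map(int, re.match(r"c(\d+)k(\d+)s(\d+)p(\d+)", spec).groups())
    return {"filters": c, "kernel": k, "stride": s, "padding": p}

def batch_norm_2d(x, gamma=1.0, beta=0.0, eps=1e-5):
    """y = gamma * (x - mu) / sqrt(var + eps) + beta, with per-channel
    statistics over the batch and spatial axes; x has shape (N, C, H, W)."""
    mu = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    return gamma * (x - mu) / np.sqrt(var + eps) + beta

def relu(x):
    """ReLU(x) = max(0, x)."""
    return np.maximum(0.0, x)
```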
To select the best model for this task, a comparison of 4 different deep learning frameworks, namely HRNet 43, UNet [6] , DeepLabV3 44, and FCN 45, was performed. Considering that the number of parameters can impact the representation ability of the model, for fairness the models were adjusted to have a roughly equivalent number of parameters by changing the number of filters, the number of residual blocks, the kernel and stride sizes of convolutional layers, etc. Table 4 presents a summary of the models. In addition to model parameters, the inference memory and FLOPs of the different models were also counted. The metric inference memory refers to the memory required when using the model for testing. FLOPs stands for floating point operations, which is used to measure model complexity. As shown, the different models have a similar number of parameters, which means the representation ability of the different models is close. Therefore, by eliminating the impact of model representation ability, the model performance is largely related to the architecture of the model. Table 5 compares the performance of the different models for landmark detection. As shown, HRNet outperforms the other 3 deep learning models by a large margin, especially for the detection of the left and right inferior scapular angle landmarks. Combining Tables 4 and 5, HRNet uses a modest amount of inference memory and relatively fewer computations (FLOPs) to achieve the best performance.
The direct outputs of the 4 deep learning models are landmark heatmaps. The final predicted position of each landmark is the position of the highest value in the corresponding landmark heatmap. The values in a landmark heatmap are between 0 and 1 and can be considered the probability of the landmark position. Therefore, a lower value in the landmark heatmap means a lower probability of the landmark being located at that position, and vice versa. For landmark detection, although the precision of the landmark location is important, the probability distribution of each landmark is equally crucial. A threshold is used to filter the heatmap results, and then the position of the highest value on the filtered heatmap is detected. When increasing the threshold value, some landmark positions with low probability (smaller than the threshold) are filtered out, and thus those landmarks cannot be retrieved. The retrieval rate was counted in terms of the probability threshold for each of the 6 landmarks and the results are presented in Table 6. As shown, compared with the other deep learning architectures, HRNet performs consistently better on the detection of all 6 landmarks. Even when the threshold is set to 0.6, HRNet still achieves a good landmark retrieval rate.
Normally, the predicted location of a landmark with a low probability is not accurate and may have a negative impact on the disease analysis. The simplest method is to use a threshold to filter out bad results with low probability, but if the threshold value is too high, some good predictions can be filtered out, which deteriorates the model performance. As a result, there is a trade-off between the probability threshold and model performance. To decide on a rational threshold value, the curve between the retrieval rate/MMD/MED and the probability threshold was plotted in Figure 12. According to the figure, 0.6 was selected as the probability threshold to filter out bad predictions. With this threshold value, the HRNet used in the present invention can still retrieve over 90% of the landmarks while the predicted landmarks with low probability are removed.
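The thresholded argmax decoding and retrieval-rate counting described above can be sketched as follows (function names and array shapes are assumptions for illustration):

```python
import numpy as np

def retrieve_landmarks(heatmaps, threshold=0.6):
    """Decode each heatmap to its argmax position, keeping only landmarks
    whose peak probability reaches the threshold.

    heatmaps: array of shape (num_landmarks, H, W) with values in [0, 1].
    Returns a list of (x, y) tuples, or None for unretrieved landmarks.
    """
    results = []
    for hm in heatmaps:
        if hm.max() < threshold:
            results.append(None)                      # not retrieved
        else:
            y, x = np.unravel_index(hm.argmax(), hm.shape)
            results.append((int(x), int(y)))
    return results

def retrieval_rate(results):
    return sum(r is not None for r in results) / len(results)
```

Raising the threshold trades retrieval rate for confidence: low-probability (usually inaccurate) predictions are discarded at the cost of occasionally dropping good ones.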
4.2.2 RCI synthesis module.
The RCI synthesis module consists of a generator and a discriminator. In this study, the generator adopts a ResNet backbone with 9 residual blocks while the discriminator uses the PatchGAN framework with 5 convolutional layers. The performance of the generator is not only related to its own structure but is also closely related to the discriminator. Therefore, ablation studies were conducted to evaluate different combinations and architectures of generator and discriminator to obtain the optimal generation results. Table 7 evaluates the RCI synthesis results using several quantitative metrics. There are 7 commonly used quantitative measurements adopted in the table to assess the synthesis results of the generator. PSNR (peak signal-to-noise ratio) 46 and SSIM (structural similarity) 47 are two classical pixel-wise metrics which measure the pixel-wise difference between the synthesized image and the reference image. FID (Fréchet inception distance) 48 compares the
Gaussian distribution of InceptionV3 activations between the generated images and ground truth images. VIF (visual information fidelity) 49 is a full-reference image quality metric based on natural scene statistics to evaluate the fidelity of the generated images. LPIPS (learned perceptual image patch similarity) 50 measures the l2 distance between the AlexNet activations of the generated images and ground truth images. NIQE (natural image quality evaluator) 51 is a no-reference image quality metric which utilizes measurable deviations from statistical regularities observed in natural images. Mathematically, it calculates the Mahalanobis distance between two multi-variate Gaussian models of 36 features extracted from the generated images. BRISQUE (blind/referenceless image spatial quality evaluator) 52 is another no-reference image quality metric which extracts features from a Gaussian model of mean-subtracted contrast-normalized coefficients obtained from the generated images. These features are then fed into an SVM to acquire the quality score.
The selected 7 metrics evaluate the model performance in multiple aspects. From Table 7, pixel-wise quality metrics such as PSNR and SSIM are inappropriate for evaluating the quality of the synthesized RCI, since pixel-level accurate reconstruction of the ground-truth X-rays is not pursued. Also, from the results, it can be seen that the best results in terms of PSNR and SSIM are totally different, which also reflects that these two metrics are not stable for evaluating the results of this task. On the contrary, the other distribution-based and feature-based image quality metrics, i.e., VIF, FID, LPIPS, NIQE, and BRISQUE, performed much more consistently.
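For reference, the two pixel-wise metrics discussed above can be computed as in this sketch. Note the SSIM here is a simplified single-window (global) form; practical SSIM implementations average over local windows, so this is for illustration only.

```python
import numpy as np

def psnr(ref, gen, data_range=1.0):
    """Peak signal-to-noise ratio (dB) between reference and generated images."""
    mse = np.mean((np.asarray(ref, float) - np.asarray(gen, float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(data_range ** 2 / mse)

def ssim_global(x, y, data_range=1.0):
    """Single-window (global) SSIM with the usual stabilizing constants."""
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

Because both metrics compare images pixel by pixel, even a small spatial misalignment between the RGBD-derived RCI and the radiograph depresses them, which is why they behave inconsistently in this task.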
An ablation study examined 4 generators and 3 discriminators, i.e., in total 12 different combinations of the GAN framework. For the generator, 2 architectures were studied, namely ResNet and UNet with different hierarchical structures (6 and 9 blocks for the ResNet architecture; 7 and 8 scale levels for the UNet architecture) . For the discriminator, 2 architectures were also studied, namely PatchGAN and PixelGAN. PatchGAN is a model which outputs a matrix of Boolean values, each of which corresponds to a small patch in the original image. PixelGAN is a special case of PatchGAN whose output Boolean matrix has the same spatial size as the input image; therefore, each value in the output matrix corresponds to a single pixel in the input image. According to the experiments, for the same generator, a discriminator adopting PatchGAN outperforms one adopting PixelGAN, and for the same discriminator, a generator adopting ResNet outperforms one adopting UNet. From Table 7, it can be seen that a generator with a ResNet 9-block architecture combined with a discriminator with a 5-convolutional-layer PatchGAN architecture achieves the best performance.
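The patch size that a PatchGAN discriminator judges is determined by the receptive field of its convolutional stack. The sketch below computes it for an assumed 5-layer configuration (three stride-2 and two stride-1 4×4 convolutions), a common choice that yields 70×70 patches; the exact layer settings of the study's discriminator are not specified here.

```python
def receptive_field(layers):
    """Receptive field (patch size) of a stack of convolutional layers,
    computed back to front; layers is a list of (kernel, stride) tuples."""
    rf = 1
    for k, s in reversed(layers):
        rf = (rf - 1) * s + k
    return rf

# An assumed 5-layer PatchGAN configuration: three stride-2 and two
# stride-1 4x4 convolutions, a common choice yielding 70x70 patches.
patchgan_5_layers = [(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]
```

Each output value of such a discriminator judges one patch of the input image, which is what "a matrix of Boolean values, each corresponding to a small patch" refers to; a PixelGAN is the degenerate case with a receptive field of a single pixel.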
5. Discussion
Medical image synthesis (MIS) has been extensively studied for mutually transforming inter- and intra-modality medical images with deep learning. However, this study demonstrated a few crucial and novel points on this topic. First, it demonstrated that an AI-powered system has the potential to synthesize RCIs from optical images (RGBD images) . This finding can potentially provide a radiation-free screening and analysis technique to assist AIS detection and diagnosis. Second, the experimental results show that the analytic results obtained from synthesized RCIs are coherent with the results from real radiographs
(GT) . In this regard, the developed light-based RCI synthesis system has the capacity to generate reliable RCI to substantively assist the clinical diagnosis procedure for AIS.
An accurate detection of the 6 anatomical landmarks with high confidence is crucial for RCI synthesis, since it provides useful information to assist the identification of the spine morphology. With this in mind, the bareback landmark detection module was designed to output the heatmaps of the 6 bareback landmarks, indicating both the landmark position and the probability of its presence. The validity and generalization of the bareback landmark detection module were examined on the prospective test dataset quantitatively and statistically, and it achieved less than 4 pixels of error on average for each back landmark in terms of both the MED and MMD measurements (Table 3) . In addition, the model can accurately predict the landmarks from the nude back images of patients with different severity levels of AIS, as shown in Figure 2b.
To find the optimal deep learning architecture for back landmark detection, 4 different classical frameworks were compared. Considering that the number of parameters can impact the capacity of the model37, for fairness the models were adjusted to have a roughly equivalent number of parameters by changing the number of filters, the number of residual blocks, the kernel and stride sizes of convolutional layers, etc. Supplementary Table 2 provides a summary of the models. In addition to model parameters, the inference memory and floating-point operations (FLOPs) of the different models were also compared. The metric inference memory refers to the memory required when using the model for testing, while FLOPs is used to measure model complexity. As shown, the number of parameters in the different models was controlled to be similar so that the capacities of the different models were comparable. By minimizing the impact of model capacity, model performance is then largely related to architecture. Supplementary Table 3 compares the performance of the different models for landmark detection. As shown, HRNet outperforms the other 3 deep learning models by a large margin, especially for the detection of the left and right inferior scapular angle landmarks. Combining Supplementary Table 2 and Supplementary Table 3, HRNet uses a modest amount of inference memory and relatively fewer computations (FLOPs) to achieve the best performance.
To evaluate the quality of the predicted landmark heatmaps, a threshold was used to filter the heatmap results. When increasing the threshold value, some landmark positions with low probability (smaller than the threshold) are filtered out and cannot be retrieved. Supplementary Table 4 illustrates the retrieval rate under different probability thresholds for each of the 6 landmarks. As shown, compared with the other deep learning architectures, HRNet performs consistently better on the detection of all 6 landmarks under different probability thresholds. A landmark located at a position with a low probability value is usually not accurate. The probability threshold can help filter out such inaccurate predictions. However, if the threshold is too high, some good predictions can be filtered out, which deteriorates the model performance. To decide on a rational threshold, the curve between the retrieval rate/MMD/MED and the probability threshold was plotted. As shown in Figure 12, 0.6 was selected as the probability threshold to filter out bad predictions. With this threshold, HRNet can still retrieve over 90% of the landmarks while the predicted landmarks with low probabilities are removed.
The quality of the synthesized RCIs was assessed in two aspects, namely the image quality and the usability of the synthesized images in analytic clinical applications. Given the misalignment between RGBD images and radiographs, it was not appropriate to use pixel-wise image quality metrics (e.g., PSNR and SSIM) to evaluate the model performance. In this study, 5 metrics were introduced which measure the quality of the synthesized image in terms of the distribution or properties of extracted image features (FID, LPIPS and NIQE) , image fidelity (VIF) , and image spatial quality (BRISQUE) . Supplementary Table 5 compares the performance of the different metrics on the synthesized RCIs. As shown, both PSNR and SSIM had low values, and their results were not consistent. In comparison, the measuring results obtained in terms of FID, LPIPS, NIQE, VIF, and BRISQUE were consistent, especially for the best two combinations of generator and discriminator.
The clinical quality and usability of the synthesized RCIs were assessed in multiple analytic clinical applications, including severity grading, curve type identification, and CA prediction. According to the data demographics in Figure 1c, moderate patients account for about 60% of the participants, the normal-mild group is the second largest (about 30%), and the severe group is the minority (about 10%). Correspondingly, from the synthesized RCIs, clinicians can distinguish the moderate and normal-mild patients with high sensitivity (>95%), while distinguishing severe patients with relatively lower sensitivity (0.909), as shown in Table 3. Even so, the severity grading results are still satisfactory in terms of accuracy, and the performance on curve type classification further validates the good quality of the synthesized RCIs.
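Sensitivity figures such as the 0.909 reported for the severe group are per-class recalls computed from a grading confusion matrix. A sketch with hypothetical counts (the matrix below is invented for illustration; its severe row reproduces 10/11 ≈ 0.909):

```python
import numpy as np

def per_class_sensitivity(confusion):
    """Sensitivity (recall) per class from a confusion matrix whose rows
    are ground-truth classes and whose columns are predicted classes."""
    confusion = np.asarray(confusion, dtype=float)
    return np.diag(confusion) / confusion.sum(axis=1)

# Hypothetical counts for [normal-mild, moderate, severe] grading:
cm = [[29, 1, 0],
      [1, 58, 1],
      [0, 1, 10]]
print(per_class_sensitivity(cm))  # severe row: 10 / 11 ≈ 0.909
```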
The agreement of the CA prediction results was examined using linear regression analysis, and the results indicated a strong correlation with the CAs predicted from GT radiographs, as shown in Figure 4 (R2 = 0.984; p-value < 0.0001). In addition, the differences between the CAs predicted from the two spine alignment sources were small (mean difference = -0.86), as shown in Figure 4. In the visual results presented in Figure 5, the alignment of the spine has been clearly synthesized; although there are mismatches in some vertebral regions, the CA can still be accurately predicted.
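The agreement analysis pairs a least-squares fit (for R²) with the mean difference between the two CA sources. A sketch with synthetic values; the helper name and data are assumptions, not the study's code:

```python
import numpy as np

def agreement_stats(ca_gt, ca_pred):
    """R^2 of a least-squares line fit plus the mean difference between
    two Cobb-angle sources (a regression + mean-bias agreement check)."""
    ca_gt = np.asarray(ca_gt, dtype=float)
    ca_pred = np.asarray(ca_pred, dtype=float)
    slope, intercept = np.polyfit(ca_gt, ca_pred, 1)
    fitted = slope * ca_gt + intercept
    ss_res = np.sum((ca_pred - fitted) ** 2)
    ss_tot = np.sum((ca_pred - ca_pred.mean()) ** 2)
    r2 = 1 - ss_res / ss_tot
    mean_diff = np.mean(ca_pred - ca_gt)  # negative: synthesis under-reads GT
    return r2, mean_diff
```

With perfectly offset synthetic data the fit is exact (R² = 1) and the mean difference equals the offset, mirroring how the reported R² = 0.984 and mean difference of -0.86 summarise Figure 4.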
The study has a few limitations. The performance of the CA prediction largely depends on the quality of the synthesized RCIs, which in turn relies on the accuracy of the detected anatomical landmarks. Considering that the nude back features of obese patients may not be distinct enough to obtain accurate landmarks, no overweight patients were recruited in this study. Another potential limitation is skin colour. Since most of the participants are Southeast Asian Chinese, the dataset used for model training may not accurately represent the skin colour of other population groups. In addition, the system was tested in two centres following the same data collection procedure and the same clinical assessment standard; the performance of the model of the present invention may degrade when directly applied in another centre. An international multi-centre trial is foreshadowed to further assess the reliability of the system and device. To investigate the impact of sample size, the curve-fitting method was used. However, this method was originally proposed to study how sample size affects classification models, so the results may not accurately reflect the impact of sample size on the present models. Nevertheless, the analysis of overall trends still has reference value: when the sample size of the training data decreases, the stability and performance of both models deteriorate, and when the training sample size is very small, both models have a high risk of overfitting.
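The curve-fitting approach mentioned above typically fits an inverse power law to (sample size, error) pairs and extrapolates the learning curve. A hedged sketch using SciPy's `curve_fit`; the data points and parameter names below are invented for illustration:

```python
import numpy as np
from scipy.optimize import curve_fit

def learning_curve(n, a, b, c):
    """Inverse power-law model relating sample size to validation error:
    err(n) = a * n**(-b) + c, where c is the asymptotic error floor."""
    return a * np.power(n, -b) + c

# Hypothetical (sample size, validation error) pairs:
sizes = np.array([50, 100, 200, 400, 800], dtype=float)
errors = np.array([0.30, 0.22, 0.17, 0.14, 0.125])

params, _ = curve_fit(learning_curve, sizes, errors, p0=(1.0, 0.5, 0.1))
projected = learning_curve(1600.0, *params)  # extrapolated error at n = 1600
```

A positive fitted exponent `b` confirms the trend noted above: error grows as training data shrinks, and the extrapolation gives a rough sense of the gain from more samples.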
In summary, in evaluation of the present invention, the investigators deployed the first prospectively tested auto-alignment analysis model for spine malalignment analysis using deep learning and RGBD technologies with no radiation. With further multi-centre validation in the future, the platform can better assist clinicians and support clinical research at large volumes.
6. References
1. Cobb J. Outline for the study of scoliosis. Instructional Course Lectures AAOS 1948; 5: 261-75.
2. Fong DYT, Cheung KMC, Wong Y-W, et al. A population-based cohort study of 394,401 children followed for 10 years exhibits sustained effectiveness of scoliosis screening. The Spine Journal 2015; 15(5): 825-33.
3. Luk KD, Lee CF, Cheung KM, et al. Clinical effectiveness of school screening for adolescent idiopathic scoliosis: a large population-based retrospective cohort study. Spine 2010; 35(17): 1607-14.
4. Weinstein SL, Dolan LA, Spratt KF, Peterson KK, Spoonamore MJ, Ponseti IV. Health and function of patients with untreated idiopathic scoliosis: a 50-year natural history study. JAMA 2003; 289(5): 559-67.
5. Yang J, Zhang K, Fan H, et al. Development and validation of deep learning algorithms for scoliosis screening using back images. Communications Biology 2019; 2(1): 1-8.
6. Cheung JPY, Zhang T, Bow C, Kwan K, Sze KY, Cheung KMC. The crooked rod sign: a new radiological sign to detect deformed threads in the distraction mechanism of magnetically controlled growing rods and a mode of distraction failure. Spine 2020; 45(6): E346-E51.
7. Côté P, Kreitz BG, Cassidy JD, Dzus AK, Martel J. A study of the diagnostic accuracy and reliability of the Scoliometer and Adam's forward bend test. Spine 1998; 23(7): 796-802.
8. Fong DYT, Lee CF, Cheung KMC, et al. A meta-analysis of the clinical effectiveness of school scoliosis screening. Spine 2010; 35(10): 1061-71.
9. Chung N, Cheng Y-H, Po H-L, et al. Spinal phantom comparability study of Cobb angle measurement of scoliosis using digital radiographic imaging. Journal of Orthopaedic Translation 2018; 15: 81-90.
10. Grauers A, Einarsdottir E, Gerdhem P. Genetics and pathogenesis of idiopathic scoliosis. Scoliosis and Spinal Disorders 2016; 11(1): 1-7.
11. Rodemann HP, Blaese MA. Responses of normal cells to ionizing radiation. Seminars in Radiation Oncology 2007: Elsevier; p. 81-8.
12. Brody AS, Frush DP, Huda W, Brent RL. Radiation risk to children from computed tomography. Pediatrics 2007; 120(3): 677-82.
13. Cheung C-WJ, Law S-Y, Zheng Y-P. Development of 3-D ultrasound system for assessment of adolescent idiopathic scoliosis (AIS): and system validation. International Conference of the IEEE Engineering in Medicine and Biology Society 2013: IEEE; p. 6474-7.
14. Czaprowski D, A, Sitarski D, Kotwicki T. Intra- and interobserver repeatability of the assessment of anteroposterior curvatures of the spine using Saunders digital inclinometer. Ortopedia, Traumatologia, Rehabilitacja 2012; 14(2): 145-53.
15. Melvin M, Sylvia M, Udo W, Helmut S, Paletta JR, Adrian S. Reproducibility of rasterstereography for kyphotic and lordotic angles, trunk length, and trunk inclination: a reliability study. Spine 2010; 35(14): 1353-8.
16. Perriman DM, Scarvell JM, Hughes AR, Ashman B, Lueck CJ, Smith PN. Validation of the flexible electrogoniometer for measuring thoracic kyphosis. Spine 2010; 35(14): E633-E40.
17. Meng N, Ge Z, Zeng T, Lam EY. LightGAN: A deep generative model for light field reconstruction. IEEE Access 2020; 8: 116052-63.
18. Meng N, Li K, Liu J, Lam EY. Light Field View Synthesis via Aperture Disparity and Warping Confidence Map. IEEE Transactions on Image Processing 2021; 30: 3908-21.
19. Freitag MT, Fenchel M, P, et al. Improved clinical workflow for simultaneous whole-body PET/MRI using high-resolution CAIPIRINHA-accelerated MR-based attenuation correction. European Journal of Radiology 2017; 96: 12-20.
20. Zhang T, Li Y, Cheung JPY, Dokos S, Wong K-YK. Learning-based coronal spine alignment prediction using smartphone-acquired scoliosis radiograph images. IEEE Access 2021; 9: 38287-95.
21. Devic S. MRI simulation for radiotherapy treatment planning. Med Phys 2012; 39(11): 6701-11.
22. Liu F, Jang H, Kijowski R, Bradshaw T, McMillan AB. Deep learning MR imaging-based attenuation correction for PET/MR imaging. Radiology 2018; 286(2): 676-84.
23. Dong X, Lei Y, Tian S, et al. Synthetic MRI-aided multi-organ segmentation on male pelvic CT using cycle consistent deep attention network. Radiotherapy and Oncology 2019; 141: 192-9.
24. Yang Q, Yan P, Zhang Y, et al. Low-dose CT image denoising using a generative adversarial network with Wasserstein distance and perceptual loss. IEEE Transactions on Medical Imaging 2018; 37(6): 1348-57.
25. Microsoft Azure Kinect DK. 2020. https://azure.microsoft.com/en-us/services/kinect-dk/ (accessed 16 June 2018).
26. Mak T, Cheung PWH, Zhang T, Cheung JPY. Patterns of coronal and sagittal deformities in adolescent idiopathic scoliosis. BMC Musculoskeletal Disorders 2021; 22(1): 1-10.
27. Meng N, Cheung JP, Wong K-YK, et al. An artificial intelligence powered platform for auto-analyses of spine alignment irrespective of image quality with prospective validation. EClinicalMedicine 2022; 43: 101252.
28. Heusel M, Ramsauer H, Unterthiner T, Nessler B, Hochreiter S. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. Advances in Neural Information Processing Systems 2017; 30.
29. Sheikh HR, Bovik AC. Image information and visual quality. IEEE Transactions on Image Processing 2006; 15(2): 430-44.
30. Zhang R, Isola P, Efros AA, Shechtman E, Wang O. The unreasonable effectiveness of deep features as a perceptual metric. IEEE Conference on Computer Vision and Pattern Recognition 2018; p. 586-95.
31. Mittal A, Soundararajan R, Bovik AC. Making a "completely blind" image quality analyzer. IEEE Signal Processing Letters 2012; 20(3): 209-12.
32. Mittal A, Moorthy AK, Bovik AC. No-reference image quality assessment in the spatial domain. IEEE Transactions on Image Processing 2012; 21(12): 4695-708.
33. Balki I, Amirabadi A, Levman J, et al. Sample-size determination methodologies for machine learning in medical imaging research: a systematic review. Canadian Association of Radiologists Journal 2019; 70(4): 344-53.
34. Wang J, Sun K, Cheng T, et al. Deep high-resolution representation learning for visual recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 2020.
35. He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. IEEE Conference on Computer Vision and Pattern Recognition 2016; p. 770-8.
36. Isola P, Zhu J-Y, Zhou T, Efros AA. Image-to-image translation with conditional adversarial networks. IEEE Conference on Computer Vision and Pattern Recognition 2017; p. 1125-34.
37. Zhang C, Bengio S, Hardt M, Recht B, Vinyals O. Understanding deep learning (still) requires rethinking generalization. Commun ACM 2021; 64(3): 107-15.
38. "Microsoft Azure Kinect DK. " https: //azure. microsoft. com/en-us/services/kinect-dk/ (accessed 16 June, 2018) .
39. "Azure Kinect SDK (K4A) . " https: //github. com/microsoft/Azure-Kinect-Sensor-SDK (accessed 2018) .
40. V. Bonnet et al., "Automatic estimate of back anatomical landmarks and 3D spine curve from a Kinect sensor, " in IEEE International Conference on Biomedical Robotics and Biomechatronics, 2016: IEEE, pp. 924-929.
41. B. Teixeira et al., "Generating synthetic x-ray images of a person from the surface geometry, " in IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 9059-9067.
42. J. Wang et al., "Deep high-resolution representation learning for visual recognition, " IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 43, no. 10, pp. 3349-3364, 2020.
43. O. Ronneberger, P. Fischer, and T. Brox, "U-net: Convolutional networks for biomedical image segmentation, " in International Conference on Medical Image Computing and Computer-Assisted Intervention, 2015: Springer, pp. 234-241.
44. L. -C. Florian and S. H. Adam, "Rethinking atrous convolution for semantic image segmentation, " in IEEE Conference on Computer Vision and Pattern Recognition, 2017.
45. J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation, " in IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 3431-3440.
46. A. Hore and D. Ziou, "Image quality metrics: PSNR vs. SSIM, " in International Conference on Pattern Recognition, 2010: IEEE, pp. 2366-2369.
47. Z. Wang, A.C. Bovik, H.R. Sheikh, and E.P. Simoncelli, "Image quality assessment: from error visibility to structural similarity, " IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600-612, 2004.
48. M. Heusel, H. Ramsauer, T. Unterthiner, B. Nessler, and S. Hochreiter, "Gans trained by a two time-scale update rule converge to a local nash equilibrium, " Advances in Neural Information Processing Systems, vol. 30, 2017.
49. H.R. Sheikh and A.C. Bovik, "Image information and visual quality, " IEEE Transactions on Image Processing, vol. 15, no. 2, pp. 430-444, 2006.
50. R. Zhang, P. Isola, A.A. Efros, E. Shechtman, and O. Wang, "The unreasonable effectiveness of deep features as a perceptual metric, " in IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 586-595.
51. A. Mittal, R. Soundararajan, and A. C. Bovik, "Making a “completely blind” image quality analyzer, " IEEE Signal Processing Letters, vol. 20, no. 3, pp. 209-212, 2012.
52. A. Mittal, A.K. Moorthy, and A.C. Bovik, "No-reference image quality assessment in the spatial domain," IEEE Transactions on Image Processing, vol.21, no.12, pp.4695-4708, 2012.
Claims (19)
- A computerized system for providing a radiograph-comparable image (RCI) of a region of interest (ROI) spinal region of a subject, said system comprising:
a Red Green Blue-Depth (RGBD) standardization module, for receiving an RGBD input image of the ROI of a subject and for providing a standardised RGBD image of said ROI;
a back landmark detection module, for detecting anatomical landmarks from the standardised RGBD image of said ROI; and
a landmark guided RCI synthesis module, for providing a synthesized RCI of the ROI of the spine of the subject from the detected anatomical landmarks from the back landmark detection module and from the standardised RGBD image;
wherein the Red Green Blue-Depth (RGBD) standardization module is implemented with rule-based and adaptive algorithms to standardize the images;
wherein the back landmark detection module and the landmark guided RCI synthesis module utilise computerised deep learning techniques; and
wherein an RGBD input image of a region of interest of the spine of a subject is acquired by an RGBD image acquisition device for providing the RGBD input image.
- A system according to claim 1, further comprising a quantitative alignment analysis module for analysing the alignment of the spine of the subject from the RCI, wherein the quantitative alignment analysis module utilises computerised deep learning techniques and provides a predictive output.
- A system according to claim 2, wherein the predictive output is an analysis of the Cobb angle of the subject.
- A system according to claim 2 or claim 3, wherein the predictive output provides the severity of any spinal deformity of the subject, and the curve classification of the deformity of the spine of the subject.
- A system according to any one of the preceding claims, further comprising a radiographic standardisation module which is implemented with rule-based and adaptive algorithms, wherein the radiographic standardisation module receives a radiographic image of an ROI of a spine and provides a standardised X-ray, and wherein the RCI synthesis module further provides the synthesized RCI of the ROI of the spine of the subject from the standardised X-ray.
- A system according to claim 5, wherein the X-ray image is an X-ray image of the ROI of the subject.
- A system according to any one of the preceding claims, wherein the input image is a two-dimensional (2D) image.
- A system according to any one of claims 1 to 6, wherein the input image is a three-dimensional (3D) image.
- A system according to any one of the preceding claims, wherein the system utilises generative Artificial Intelligence (AI) to generate spine alignment using the RGBD input image.
- A system according to claim 9, wherein said spine alignment is three-dimensional spine alignment.
- A process operable on a computerized system for providing a radiograph-comparable image (RCI) of the spinal region of a subject, said system comprising a Red Green Blue-Depth (RGBD) standardization module, a back landmark detection module, a landmark guided RCI synthesis module and a quantitative alignment analysis module; wherein the Red Green Blue-Depth (RGBD) standardization module is implemented with rule-based and adaptive algorithms to standardize the images; and wherein the back landmark detection module and the landmark guided RCI synthesis module utilise computerised deep learning techniques, said process including the steps of:
(i) acquiring an RGBD input image of a region of interest of the spine by way of an RGBD image acquisition device;
(ii) receiving the RGBD input image by the back landmark detection module;
(iii) synthesizing an RCI by way of the landmark guided RCI synthesis module; and
(iv) analysing the alignment of the spine of the subject from the RCI, by way of the quantitative alignment analysis module.
- A process according to claim 11, wherein the system further comprises a quantitative alignment analysis module for analysing the alignment of the spine of the subject from the RCI, and wherein the quantitative alignment analysis module utilises computerised deep learning techniques and provides a predictive output.
- A process according to claim 12, wherein the predictive analysis output is the Cobb angle of the subject.
- A process according to claim 12 or claim 13, wherein the alignment analysis module provides the severity of any spinal deformity of the subject, and the curve classification of the deformity of the spine of the subject.
- A process according to any one of claims 11 to 14, further comprising a radiographic standardisation module which is implemented with rule-based and adaptive algorithms, wherein the radiographic standardisation module receives a radiographic image of an ROI of a spine and provides a standardised X-ray, and wherein the RCI synthesis module further provides the synthesized RCI of the ROI of the spine of the subject from the standardised X-ray.
- A process according to any one of claims 11 to 15, wherein the input image is a two-dimensional (2D) image.
- A process according to any one of claims 11 to 15, wherein the input image is a three-dimensional (3D) image.
- A process according to any one of claims 11 to 17, wherein generative Artificial Intelligence (AI) is utilised to generate spine alignment using the RGBD input image.
- A process according to claim 18, wherein said spine alignment is three-dimensional spine alignment.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| HK32023074701 | 2023-06-19 | ||
| HK32023074701.8 | 2023-06-19 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2024260371A1 (en) | 2024-12-26 |
Family
ID=93936421
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2024/100127 WO2024260371A1 (en), Pending | | 2023-06-19 | 2024-06-19 |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2024260371A1 (en) |
Citations (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN107481228A (en) * | 2017-07-28 | 2017-12-15 | 电子科技大学 | Human body back scoliosis angle measurement method based on computer vision |
| CN108697375A (en) * | 2016-02-15 | 2018-10-23 | 学校法人庆应义塾 | Spinal alignment estimating device, spinal alignment presumption method and spinal alignment program for estimating |
| US20210118134A1 (en) * | 2019-10-17 | 2021-04-22 | Posture Co., Inc. | Method and system for postural analysis and measuring anatomical dimensions from a radiographic image using machine learning |
| CN114092447A (en) * | 2021-11-23 | 2022-02-25 | 北京阿尔法三维科技有限公司 | Method, device and equipment for measuring scoliosis based on human body three-dimensional image |
| CN114081471A (en) * | 2021-11-11 | 2022-02-25 | 宜宾显微智能科技有限公司 | A method for measuring the cobb angle of scoliosis based on three-dimensional images and multi-layer perception |
| CN114173642A (en) * | 2019-06-24 | 2022-03-11 | 香港科洛华医疗科技有限公司 | Apparatus, method and system for diagnosing and tracking the development of a spinal alignment of a human |
| CN114287915A (en) * | 2021-12-28 | 2022-04-08 | 深圳零动医疗科技有限公司 | Noninvasive scoliosis screening method and system based on back color image |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN111417980B (en) | Three-dimensional medical image analysis method and system for identification of vertebral fractures | |
| JP5603859B2 (en) | Method for controlling an analysis system that automatically analyzes a digitized image of a side view of a target spine | |
| Pietka et al. | Integration of computer assisted bone age assessment with clinical PACS | |
| CN114287915A (en) | Noninvasive scoliosis screening method and system based on back color image | |
| CN112116004B (en) | Focus classification method and device and focus classification model training method | |
| US10872408B2 (en) | Method and system for imaging and analysis of anatomical features | |
| US9123101B2 (en) | Automatic quantification of asymmetry | |
| CN108697375A (en) | Spinal alignment estimating device, spinal alignment presumption method and spinal alignment program for estimating | |
| US20160228008A1 (en) | Image diagnosis device for photographing breast by using matching of tactile image and near-infrared image and method for aquiring breast tissue image | |
| CHOI et al. | CNN-based spine and Cobb angle estimator using moire images | |
| US7539332B1 (en) | Method and system for automatically identifying regions of trabecular bone tissue and cortical bone tissue of a target bone from a digital radiograph image | |
| US20250173860A1 (en) | Systems, devices, and methods for spine analysis | |
| Meng et al. | Radiograph-comparable image synthesis for spine alignment analysis using deep learning with prospective clinical validation | |
| Hareendranathan et al. | Artificial intelligence to automatically assess scan quality in hip ultrasound | |
| Kim et al. | Intra-and inter-reader reliability of semi-automated quantitative morphometry measurements and vertebral fracture assessment using lateral scout views from computed tomography | |
| WO2022017238A1 (en) | Automatic detection of vertebral dislocations | |
| Badarneh et al. | Semi-automated spine and intervertebral disk detection and segmentation from whole spine MR images | |
| Giannoglou et al. | Review of advances in cobb angle calculation and image-based modelling techniques for spinal deformities | |
| WO2024260371A1 (en) | A device, process and system for diagnosing and determining the spinal alignment of a person | |
| Hassan et al. | Ensemble learning of deep CNN models and two stage level prediction of Cobb angle on surface topography in adolescents with idiopathic scoliosis | |
| KR102258070B1 (en) | Method for evaluating foot type and device evaluating foot type using the same | |
| Jo et al. | Preoperative rotator cuff tear prediction from shoulder radiographs using a convolutional block attention module-integrated neural network | |
| Arivudaiyanambi et al. | Classification of markerless 3D dorsal shapes in adolescent idiopathic scoliosis patients using machine learning approach | |
| CN120259300B (en) | Acetabular cup prediction system and method based on computed tomography images | |
| Lang et al. | Anatomical landmark detection on bi-planar radiographs for predicting spinopelvic parameters |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 24825264; Country of ref document: EP; Kind code of ref document: A1 |