WO2022177044A1 - Apparatus and method for generating high-resolution chest x-ray image by using attention-mechanism-based multi-scale conditional generative adversarial neural network - Google Patents
- Publication number
- WO2022177044A1 (PCT/KR2021/002628)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- chest
- ray image
- discriminator
- branch
- generating
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/09—Supervised learning
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B6/00—Apparatus or devices for radiation diagnosis; Apparatus or devices for radiation diagnosis combined with radiation therapy equipment
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0475—Generative networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/094—Adversarial learning
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H30/00—ICT specially adapted for the handling or processing of medical images
- G16H30/40—ICT specially adapted for the handling or processing of medical images for processing medical images, e.g. editing
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
Definitions
- The present invention relates to an apparatus and method for generating a high-resolution chest X-ray image using an attention-mechanism-based multi-scale conditional generative adversarial neural network.
- A common test for early detection of heart and lung diseases is the chest X-ray, in which X-rays are transmitted through the chest to determine the presence or absence of heart- and lung-related diseases.
- Compared with other, more expensive examinations such as computed tomography (CT) and magnetic resonance imaging (MRI), chest X-ray examination is relatively inexpensive, can be performed in a short time, and involves a low radiation dose.
- A balanced data set is very important for training convergence when training current supervised-learning-based AI models.
- However, because of differences in disease prevalence, most collected medical data are numerically imbalanced, and when used as training data without pre-processing, the model may overfit to specific diseases and converge to undesirable results.
- Training progresses well for frequently occurring diseases (atelectasis, pleural effusion, infiltration, etc.), whereas training performance for other diseases (pneumonia, cardiomegaly, etc.) is relatively poor.
- Korean Patent Registration No. 10-2119056 discloses a method and apparatus for learning a medical image based on a generative adversarial neural network.
- The present invention seeks to provide an apparatus and method for generating chest X-ray images that can produce, with only a single network, high-resolution images that well reflect the characteristics of each disease, in order to solve the numerical-imbalance problem of chest X-ray image data sets covering multiple diseases.
- An embodiment of the present invention provides a chest X-ray image generating apparatus comprising: a model building unit for constructing a multi-scale conditional generative adversarial neural network including a generator, which receives a latent variable and a condition variable for the characteristics of a specific disease and performs up-sampling in each branch to generate a plurality of chest X-ray images of different resolutions representing the specific disease, and a discriminator, which discriminates whether the plurality of chest X-ray images are real or generated; and an image generator for generating a chest X-ray image representing the specific disease at a resolution greater than or equal to a preset value by inputting the latent variable and the condition variable into the neural network.
- Another embodiment of the present invention provides a method for generating a chest X-ray image, comprising: constructing a multi-scale conditional generative adversarial neural network including a generator, which receives a latent variable and a condition variable for the characteristics of a specific disease and performs up-sampling in each branch to generate a plurality of chest X-ray images of different resolutions representing the specific disease, and a discriminator, which discriminates whether the plurality of chest X-ray images are real or generated; and generating a chest X-ray image representing the specific disease at a resolution greater than or equal to a preset value by inputting the latent variable and the condition variable into the network.
- According to any one of the above-described solutions, the target disease of the generated chest X-ray image can be controlled by adding a condition variable as an input, which removes the inefficiency of training one network per class and solves the problem that training is nearly impossible when the data are extremely imbalanced.
- FIG. 1 is a diagram illustrating an apparatus for generating a chest X-ray image according to an embodiment of the present invention.
- FIG. 2 is a diagram illustrating a network structure of a multi-scale conditional adversarial generating neural network according to an embodiment of the present invention.
- FIG. 3 is a diagram illustrating a loss graph of a generator and a discriminator according to an embodiment of the present invention.
- FIG. 4 is a diagram showing the Fréchet DenseNet Distance as a performance comparison result of the experimental example and the comparative example of the present invention.
- FIG. 5 is a diagram showing chest X-ray images generated as a performance comparison result of the experimental example and the comparative example of the present invention.
- FIG. 6 is a flowchart illustrating a method for generating a chest X-ray image according to an embodiment of the present invention.
- a "part" includes a unit realized by hardware, a unit realized by software, and a unit realized using both.
- one unit may be implemented using two or more hardware, and two or more units may be implemented by one hardware.
- Some of the operations or functions described as being performed by the terminal or device in this specification may be instead performed by a server connected to the terminal or device. Similarly, some of the operations or functions described as being performed by the server may also be performed in a terminal or device connected to the corresponding server.
- FIG. 1 is a diagram illustrating an apparatus for generating a chest X-ray image according to an embodiment of the present invention
- FIG. 2 is a diagram illustrating a network structure of a multi-scale conditional adversarial generation neural network according to an embodiment of the present invention.
- the X-ray image generating apparatus 1 may generate a high-resolution image by well reflecting detailed characteristics of the ribs, diaphragm, lung, heart, etc. in the chest X-ray image through a multi-scale conditional adversarial generating neural network.
- a high-resolution image is essential to reflect detailed characteristics of organs in a chest X-ray image.
- a high-resolution image can be generated by learning the multi-scale image distribution from a low-resolution image to a high-resolution image.
- the multi-scale conditional adversarial generation neural network can generate images for multiple diseases with only one network by adding a condition variable, which is a disease condition control factor.
- The shape of the skeleton and organs differs for each patient (patient-specific), and the distribution of intensity levels is unclear because of peripheral structures such as blood vessels and because of noise, so it is difficult to determine a disease from local characteristics alone.
- Addressing long-range dependencies within the image is therefore essential, because a diseased region may span the entire image or several lesions may be distributed far apart.
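The text above motivates long-range dependencies but does not spell out the attention layer. As an illustrative sketch (not part of the patent disclosure), a SAGAN-style self-attention layer over a feature map can be written as follows; the weight shapes, the channel reduction to C/8, and the residual scaling `gamma` are all assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, Wf, Wg, Wh, gamma=0.1):
    """SAGAN-style self-attention over an (H, W, C) feature map.

    Every output position is a weighted sum over *all* positions,
    which is how attention can relate lesions that lie far apart.
    Wf and Wg project to a reduced key/query dim; Wh keeps C channels.
    """
    H, W, C = x.shape
    flat = x.reshape(H * W, C)            # N x C, N = H*W positions
    f = flat @ Wf                         # queries (N x C//8)
    g = flat @ Wg                         # keys    (N x C//8)
    h = flat @ Wh                         # values  (N x C)
    attn = softmax(f @ g.T, axis=-1)      # N x N attention map
    out = attn @ h                        # attend over all positions
    return x + gamma * out.reshape(H, W, C)  # residual connection

rng = np.random.default_rng(0)
H, W, C = 8, 8, 16
x = rng.standard_normal((H, W, C))
Wf = rng.standard_normal((C, C // 8))
Wg = rng.standard_normal((C, C // 8))
Wh = rng.standard_normal((C, C))
y = self_attention(x, Wf, Wg, Wh)
```

With `gamma=0` the layer reduces to the identity, which is why such layers can be introduced late in training without disturbing the generator.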
- An example of the X-ray image generating apparatus 1 may include a mobile terminal capable of wired/wireless communication as well as a personal computer such as a desktop or a notebook computer.
- A mobile terminal is a wireless communication device that guarantees portability and mobility, and may include not only smartphones, tablet PCs, and wearable devices, but also various devices equipped with communication modules such as Bluetooth Low Energy (BLE), NFC, RFID, ultrasonic, infrared, Wi-Fi, and Li-Fi.
- the X-ray image generating apparatus 1 is not limited to the shape shown in FIG. 1 or those exemplified above.
- the chest X-ray image generating apparatus 1 may include a model building unit 100 and an image generating unit 110 .
- the model building unit 100 may build a multi-scale conditional adversarial generating neural network 200 including a generator 201 and discriminators 203 , 205 , and 207 .
- the multi-scale conditional adversarial generating neural network 200 may be, for example, a network for learning multi-scale image distribution from a low-resolution image to a high-resolution image by extending StackGAN++ and LSGAN.
- the multi-scale conditional adversarial generation neural network 200 is composed of one generator 201 and a plurality of discriminators 203, 205, and 207 to form a tree-structured network.
- The generator 201 of the multi-scale conditional generative adversarial neural network 200 receives a latent variable and a condition variable for the characteristics of a specific disease and up-samples them in each branch, thereby generating a plurality of chest X-ray images of different resolutions representing the specific disease.
- The generator 201 may learn the multi-scale image distribution, from low resolution to high resolution, in each branch, gradually generating a high-quality chest X-ray image.
- The condition variable takes the place of part of the latent variable, allowing control over which disease the generated chest X-ray image represents.
- the condition variable may be a one-hot encoding value for any one of a plurality of classes in order to generate a plurality of chest X-ray images indicating a plurality of chest diseases.
- one-hot encoding values for 8 classes may be assigned to the condition variable in order to generate chest X-ray images corresponding to 8 representative chest diseases.
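The one-hot condition variable described above can be sketched as follows; the class ordering and the latent dimension of 100 are illustrative assumptions, not values taken from the patent:

```python
import numpy as np

# The eight representative chest diseases named in the experiments;
# the index order here is an illustrative assumption.
DISEASES = ["atelectasis", "cardiomegaly", "effusion", "infiltration",
            "mass", "nodule", "pneumonia", "pneumothorax"]

def condition_vector(disease: str) -> np.ndarray:
    """One-hot condition variable for the requested disease class."""
    c = np.zeros(len(DISEASES), dtype=np.float32)
    c[DISEASES.index(disease)] = 1.0
    return c

# Generator input: latent variable z concatenated with condition c.
z = np.random.default_rng(0).standard_normal(100).astype(np.float32)
c = condition_vector("pneumonia")
generator_input = np.concatenate([z, c])  # shape (108,)
```

Changing only `c` while holding `z` fixed is what lets a single network emit images for any of the eight target diseases.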
- The initial distribution, expressed as a standard normal distribution, may be approximated to the data distribution of the corresponding scale in each branch while passing, together with the condition variable, through several hidden layers of the generator 201.
- the discriminators 203 , 205 , and 207 of the multi-scale conditional adversarial generation neural network 200 may discriminate whether a plurality of chest X-ray images are authentic or not.
- Unlike the discriminator of a conventional generative adversarial network, the discriminators 203, 205, and 207 can further discriminate whether the plurality of chest X-ray images generated by the generator 201 satisfy the condition variable.
- The discriminators 203, 205, and 207, each corresponding to one branch of the generator 201, evaluate whether the multi-scale images generated by the generator 201 are well formed, thereby guiding the generator 201 toward the optimum.
- The generator 201 may be divided into three branches to generate chest X-ray images from low resolution up to high resolution.
- the generator 201 includes a first branch 209 generating a chest X-ray image having a first resolution, a second branch 211 generating a chest X-ray image having a second resolution higher than the first resolution, and A third branch 213 for generating a chest X-ray image having a third resolution higher than the second resolution may be included.
- The generator 201 generates a chest X-ray image having the basic colors and structure by approximating the image distribution at the first resolution in the first branch; the second branch receives the gradient of that chest X-ray image and generates a chest X-ray image expressing detailed information by approximating the image distribution at the second resolution, which is higher than the first.
- The first branch 209 may include a sub-generator for a low-resolution (64×64) image, consisting of four up-block layers, an attention layer, and a 3×3 convolution layer.
- A nearest-neighbor method may be used for up-sampling in order to mitigate checkerboard artifacts.
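A minimal numpy sketch of the nearest-neighbor up-sampling mentioned above (illustrative only; the actual up-block also contains convolutions):

```python
import numpy as np

def nn_upsample(x: np.ndarray, scale: int = 2) -> np.ndarray:
    """Nearest-neighbor up-sampling of an (H, W, C) feature map.

    Unlike transposed convolution, each output pixel simply copies its
    nearest input pixel, so no checkerboard pattern is introduced.
    """
    return x.repeat(scale, axis=0).repeat(scale, axis=1)

x = np.arange(4, dtype=np.float32).reshape(2, 2, 1)
y = nn_upsample(x)  # (4, 4, 1): each input pixel becomes a 2x2 block
```

Because every value in a 2×2 output block is identical, adjacent up-sampled pixels never alternate in magnitude, which is exactly the artifact transposed convolutions tend to produce.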
- The sub-generators of the second branch 211 and the third branch 213 may each include a joining layer, two residual layers, and an up-block layer.
- After passing through these four layers, a 128×128 chest X-ray image and a 256×256 chest X-ray image, respectively, may be output through a 3×3 convolution.
- Each branch discriminator 203, 205, and 207 consists of down-sampling layers; to compute the conditional and unconditional loss functions, the input to the last layer is split into two paths, and the condition variable is combined in only one of them.
- The discriminators comprise the first discriminator 203, which determines whether the chest X-ray image generated in the first branch 209 is real or generated, the second discriminator 205, which does the same for the image generated in the second branch 211, and the third discriminator 207, which does the same for the image generated in the third branch 213.
- The loss function of the discriminator for the i-th branch of the generator 201 can be defined as in Equation 1.
- Here, the values of a and b are set to 0 and 1, respectively.
- The loss function of the generator 201 is expressed as the sum of the per-branch discriminator loss terms and may be defined as in Equation 2.
- Here, the value of d is set to 1.
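Equations 1 and 2 themselves are not reproduced in this extract. Under the stated LSGAN-style targets a = 0, b = 1, and d = 1, a minimal numpy sketch of the per-branch least-squares losses might look as follows; the function names and array shapes are illustrative assumptions:

```python
import numpy as np

A, B, D = 0.0, 1.0, 1.0  # target labels from the text: a=0, b=1, d=1

def lsgan_d_loss(d_real: np.ndarray, d_fake: np.ndarray) -> float:
    """Least-squares discriminator loss for one branch (Equation 1 form):
    push real scores toward b=1 and fake scores toward a=0."""
    return 0.5 * np.mean((d_real - B) ** 2) + 0.5 * np.mean((d_fake - A) ** 2)

def lsgan_g_loss(d_fakes: list) -> float:
    """Generator loss summed over all branch discriminators (Equation 2
    form): push each branch's fake scores toward d=1."""
    return sum(0.5 * np.mean((d_fake - D) ** 2) for d_fake in d_fakes)

# Scores from three branch discriminators on generated images:
d_real = np.array([0.9, 0.8, 1.0])
d_fake = np.array([0.1, 0.2, 0.0])
loss_d = lsgan_d_loss(d_real, d_fake)
loss_g = lsgan_g_loss([d_fake, d_fake, d_fake])
```

A discriminator that scores real images exactly 1 and fakes exactly 0 has zero loss, while a generator that fully fools every branch discriminator also reaches zero loss, matching the adversarial balance the text describes.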
- The generator 201 is trained to approximate both the unconditional and the conditional image distributions, from low-resolution to high-resolution images.
- Modeling the image distribution at multiple scales has the advantage that a gradient is generated and propagated at each scale, so gradients reach the early layers well.
- These features play a key role in stabilizing network training, making it possible to generate high-resolution images.
- FIG. 3 is a diagram illustrating a loss graph of a generator and a discriminator according to an embodiment of the present invention.
- Figure 3(a) shows the loss graph of the generator 201 during training, and Figure 3(b) shows the loss graph of the discriminators 203, 205, and 207.
- The x-axis denotes iterations and the y-axis denotes loss.
- FIG. 4 shows the Fréchet DenseNet Distance as a performance comparison result of the experimental example and the comparative example of the present invention, and FIG. 5 shows chest X-ray images generated as a result of the same comparison.
- For the chest X-ray data augmentation experiment, the applicant used 112,120 chest X-ray images covering 14 heart and lung diseases provided by the NIH Clinical Center; among them, only eight representative diseases (atelectasis, cardiomegaly, pleural effusion, infiltration, mass, nodule, pneumonia, and pneumothorax) were used.
- For performance comparison, the applicant conducted two experiments on the eight representative chest diseases.
- In the first experiment, high-resolution (256×256) images were generated for each disease with 1) a conventional deep convolutional generative adversarial network and 2) the proposed multi-scale conditional generative adversarial network.
- The performance of the networks is compared using the Fréchet DenseNet Distance (FDD) (FIG. 4).
- About 100,000 images were used in the experiment; with a batch size of 16, one epoch corresponds to 7,000 iterations.
- The discriminator gradually stabilized after 5 epochs, and the fluctuation of the generator loss stabilized around 20 epochs (140,000 iterations).
- Beyond that point, the two models lost their balance, both collapsed, and training was stopped.
- The Fréchet Inception Distance (FID) is a measure of the difference between two normal distributions; it was proposed to compensate for the drawback that the Inception Score (IS) does not use the distribution of real data.
- The FID is calculated as in Equation 3, and a smaller value means better quality: FID = ||m − m_w||² + Tr(C + C_w − 2(C·C_w)^{1/2}).
- Here, (m, C) and (m_w, C_w) are the mean and covariance of the generated data and the real data, respectively.
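Equation 3 is the standard Fréchet distance between two Gaussians. A minimal numpy sketch (illustrative, not the applicant's implementation) computes it without external dependencies by taking the matrix square root via a symmetric eigendecomposition:

```python
import numpy as np

def sqrtm_spd(M: np.ndarray) -> np.ndarray:
    """Matrix square root of a symmetric positive semi-definite matrix."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def frechet_distance(m, C, m_w, C_w) -> float:
    """Fréchet distance between N(m, C) and N(m_w, C_w), Equation 3:
    ||m - m_w||^2 + Tr(C + C_w - 2 (C C_w)^(1/2)).

    Tr((C C_w)^(1/2)) is computed as Tr((C^(1/2) C_w C^(1/2))^(1/2)),
    which is equivalent for SPD matrices and numerically more stable.
    """
    Cs = sqrtm_spd(C)
    covmean = sqrtm_spd(Cs @ C_w @ Cs)
    diff = m - m_w
    return float(diff @ diff + np.trace(C) + np.trace(C_w)
                 - 2.0 * np.trace(covmean))
```

Replacing the Inception features with DenseNet-121 features, as the applicant does, changes only where (m, C) and (m_w, C_w) come from; the distance itself is computed identically.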
- DenseNet-121 was trained with 78,468 training images and 11,219 validation images; based on the trained model, performance was quantitatively evaluated by computing the FDD values between real images and cDCGAN outputs and between real images and outputs of the proposed model.
- The real images here are the remainder of the data set not used for DenseNet-121 training.
- Because the number of images per class is uneven, the numbers of real and generated images per class were 1,000 atelectasis, 575 cardiomegaly, 1,000 effusion, 1,000 infiltration, 729 mass, 774 nodule, 242 pneumonia, and 539 pneumothorax, for a total of 5,859 images, and the test was performed in the same way for both models.
- Since the FDD between real images and images generated by the proposed multi-scale conditional generative adversarial network is generally lower than the FDD between real images and images generated by cDCGAN, a conventional deep convolutional generative adversarial network, it can be confirmed that the proposed network is superior.
- The second experiment is a qualitative comparison of real images with the results of the conventional generative adversarial network, DCGAN, and the proposed multi-scale conditional adversarial network; the location of each disease was found and marked on the images.
- FIG. 5 shows the results of real data and the conventional deep convolutional adversarial neural network, and the multi-scale conditional adversarial neural network of the present application.
- the specific method for distinguishing the disease is as follows.
- Atelectasis is an air loss of all or part of the lungs, usually accompanied by a decrease in lung volume, a displaced airway, or whitening of the lungs.
- Cardiomegaly is defined as a condition in which the ventricular wall (muscle) thickens and the weight of the myocardium increases; on the image, the length of the cardiac shadow occupies more than half of the internal length of the thoracic shadow.
- Effusion is fluid accumulated in the pleural cavity, appearing as an unbalanced increase in shading between the left and right thoracic cavities.
- Infiltration is a gathering of inflammatory cells in normal tissue; it shows an increased alveolar shadow and is often seen around the lungs.
- A mass is a large lesion with a diameter of 3 cm or more and increased shading; it can appear anywhere in the chest, including the lungs, pleura, mediastinum, and chest wall.
- A nodule is a round lung shadow with a diameter of 3 cm or less.
- Pneumonia is an inflammatory reaction in the lungs, mainly caused by bacterial infection; mesh- or honeycomb-like shadows are found in the image.
- In pneumothorax, a hole into the pleural cavity forms and a large air pocket is visible.
- The results of the conventional deep convolutional adversarial network show a significantly lower image resolution than the real data; the shapes of the ribs and the lung bronchi are not clearly visible, so it is difficult to say that the main features were properly learned.
- The results of the proposed multi-scale conditional adversarial network show image features and resolution indistinguishable from the real images, and the characteristics of each disease were learned so well that there was no significant difference from the real images. For example, in the case of cardiomegaly, the heart shape is significantly enlarged compared with other diseases, showing that the disease's characteristic "increase in heart size" was learned well.
- FIG. 6 is a flowchart illustrating a method for generating a chest X-ray image according to an embodiment of the present invention.
- The method for generating a chest X-ray image according to the embodiment illustrated in FIG. 6 includes operations processed in time series by the apparatus 1 for generating a chest X-ray image illustrated in FIG. 1. Therefore, even where omitted below, the description given above with respect to the apparatus also applies to the method shown in FIG. 6.
- a multi-scale conditional adversarial generation neural network including a generator and a discriminator may be generated.
- The generator may generate a plurality of chest X-ray images of different resolutions representing a specific disease by receiving the latent variable and the condition variable for the characteristics of the specific disease and performing up-sampling in each branch.
- the discriminator may discriminate whether the plurality of chest X-ray images are authentic or not.
- a chest X-ray image may be generated through the multi-scale conditional adversarial generation neural network. For example, by inputting a latent variable and a condition variable for a characteristic of a specific disease into a multi-scale conditional adversarial generating neural network, a chest X-ray image representing a specific disease with a resolution greater than or equal to a preset value may be generated.
- the chest X-ray image generating method described with reference to FIG. 6 may be implemented in the form of a computer program stored in a medium, or in the form of a recording medium including instructions executable by a computer, such as a program module executed by a computer.
- Computer-readable media can be any available media that can be accessed by a computer and includes both volatile and nonvolatile media, removable and non-removable media.
- Computer-readable media may include computer storage media.
- Computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data.
Description
본 발명은 주목 메커니즘 기반의 멀티 스케일 조건부 적대적 생성 신경망을 활용한 고해상도 흉부 X선 영상 생성 장치 및 방법에 관한 것이다.The present invention relates to an apparatus and method for generating a high-resolution chest X-ray image using a multi-scale conditional adversarial generating neural network based on a mechanism of interest.
최근 인공지능 기술의 발전으로 다양한 분야에서 이를 접목해 비약적인 성능 향상을 이루고 있다. 그 중 의료를 도메인으로 한 인공지능 기반의 많은 연구들이 활발히 진행되고 있는데, 이는 전 세계적인 고령화에 따른 인간의 기대수명 증가로 의료 수요가 확대되어 의료산업의 경제적 가치가 높아졌기 때문이다. 특히, 다양한 의료 융합연구들 중 심장, 폐를 대상으로 한 연구들이 많이 있다. 흉부 관련 질환은 환경적인 요인으로 인해 일반인들도 빈번하게 접할 수 있는 흔한 질환이지만 조기발견 후 적절한 조치가 이뤄지지 않으면 치명적일 수 있다. 때문에 타 기관에서 발생한 여러 질환들과 비교해봤을 때 상대적인 중요도가 커 흉부 기관 대상으로 한 다양한 연구가 이루어지고 있다. With the recent development of artificial intelligence technology, it has been applied in various fields to achieve dramatic performance improvement. Among them, many AI-based researches in the medical domain are being actively conducted because the economic value of the medical industry has increased due to the expansion of medical demand due to the increase in human life expectancy due to the aging of the world. In particular, among various medical convergence studies, there are many studies on the heart and lungs. Chest-related diseases are common diseases that are frequently encountered by the general public due to environmental factors, but can be fatal if appropriate measures are not taken after early detection. Therefore, compared with various diseases occurring in other organs, the relative importance is high, and various studies are being conducted on the thoracic organ.
심·폐질환을 조기에 발견하기 위한 일반적인 검사로는 흉부 X선이 있으며, X선을 흉곽 부위에 투과하여 심장과 폐 관련 질환 유무를 판단한다. 흉부 X선 검사는 전산단층촬영술(Computed Tomography, CT)이나 자기공명영상(Magnetic Resonance Imaging, MRI) 등 다른 고가검사와 비교하여 상대적으로 저렴하고 단시간에 촬영 가능하며 피폭 선량이 적다는 장점이 있다.A common test for early detection of heart and lung diseases is chest X-ray, and the X-ray is transmitted through the chest area to determine the presence or absence of heart and lung-related diseases. Compared to other expensive tests such as Computed Tomography (CT) or Magnetic Resonance Imaging (MRI), chest X-ray examination is relatively inexpensive, can be taken in a short time, and has the advantage of low exposure dose.
하지만 바쁜 의료현장에서 임상의들이 수많은 환자들의 영상을 정량적으로 정확하게 판독하는 것은 시간소모적인 작업이며 노동비용 또한 크다. 이를 보완하기 위해 최근에는 인공지능 기술을 활용하여 빠르고 정확하게 영상판독이 가능한 모듈에 대한 연구결과가 나오고 있다.However, it is a time-consuming task for clinicians to quantitatively and accurately read images of numerous patients in a busy medical field, and the labor cost is also high. To supplement this, recently, research results on a module that can read images quickly and accurately using artificial intelligence technology are coming out.
한편, 현재의 지도학습기반 인공지능 모델을 잘 학습하는데 있어 균형 있는(balanced) 데이터 셋은 학습 수렴에 있어 매우 중요하다. 그러나 질환 유병률 차이에 의해 수집된 의료 데이터는 대부분 수적으로 불균형하며, 전처리과정 없이 바로 학습 데이터로 사용할 경우 특정 질환에 과적합(overfitting)되어 원하지 않는 수렴 결과를 얻을 수 있다. 흉부 X선 영상의 경우도 마찬가지로 빈번히 발생하는 질환들(무기폐, 흉수, 침윤 등)은 학습이 잘 진행되는 반면, 그렇지 않은 질환(폐렴, 심장 비대 등)들의 경우 학습 성능이 상대적으로 좋지 못하다. On the other hand, a balanced data set is very important for learning convergence in learning the current supervised learning-based AI model well. However, most of the medical data collected due to the difference in disease prevalence is numerically unbalanced, and when used as training data without pre-processing, it may overfit to a specific disease, resulting in undesirable convergence results. Similarly, in the case of chest X-ray images, learning progresses well for frequently occurring diseases (atelectasis, pleural effusion, infiltration, etc.), whereas other diseases (pneumonia, cardiac hypertrophy, etc.) have relatively poor learning performance.
이러한 데이터 수적 불균형으로 인한 학습 성능 저하 문제를 해결하고자 최근 다양한 데이터 증강(data augmentation) 기술들이 제안되었고, 성능의 향상이 있었다. 영상 데이터의 대표적인 데이터 증강방법은 상하좌우 반전, 밝기조절 등이 있으나, 이렇게 재구성된 영상 데이터는 표준데이터를 기반으로 생성되기 때문에 적은 데이터에서는 성능 향상이 높지 않을 수 있다. 따라서 훨씬 더 광범위한 데이터 집합을 생성하기 위한 근본적인 해결책으로 표준데이터 수의 증가가 필요하다. In order to solve the problem of learning performance degradation due to such data numerical imbalance, various data augmentation techniques have been recently proposed and performance has been improved. Representative data augmentation methods of image data include up-down, left-right inversion, brightness adjustment, etc. However, since the reconstructed image data is generated based on standard data, the performance improvement may not be high for a small amount of data. Therefore, it is necessary to increase the number of standard data as a fundamental solution to create a much wider data set.
최근, 이러한 문제의 새로운 해결 방안으로 표준데이터 분포의 추정을 학습하는 적대적 생성 신경망(Generative Adversarial Network, GAN)기반의 데이터 증강 기법이 소개되었다. 이 신경망은 다양한 사례에 적용되어 실제와 거의 유사한 거짓 데이터를 만들어 데이터의 수적 불균형 문제를 해결함으로써 분류 성능이 향상됨을 입증하였다. 이와 관련하여, 한국등록특허 제10-2119056호는 생성적 적대 신경망 기반의 의료영상 학습 방법 및 장치를 개시하고 있다.Recently, as a new solution to this problem, a data augmentation technique based on a generative adversarial network (GAN) that learns to estimate the distribution of standard data has been introduced. This neural network has been applied to various cases and proved that classification performance is improved by solving the problem of numerical imbalance in the data by creating false data that is almost similar to the real one. In this regard, Korean Patent Registration No. 10-2119056 discloses a method and apparatus for learning a medical image based on a generative adversarial neural network.
그러나, 목표 클래스가 다수 개일 경우 각 클래스별로 적대적 생성 신경망을 학습해야 하기 때문에 비효율적이며 영상 수가 극단적으로 제한적일 경우 학습이 거의 불가능하다는 문제가 있었다.However, when there are multiple target classes, it is inefficient because the adversarial generative neural network must be trained for each class, and when the number of images is extremely limited, there is a problem that learning is almost impossible.
본 발명은 다수의 질환에 대한 흉부 X선 영상 데이터 셋의 수적 불균형 문제 해결을 목적으로 단 하나의 네트워크만으로도 질환 별 특징이 잘 반영된 고해상도 영상을 생성할 수 있는 흉부 X선 영상 생성 장치 및 방법을 제공하고자 한다. The present invention aims to provide an apparatus and method for generating chest X-ray images that can produce, with only a single network, high-resolution images in which the characteristics of each disease are well reflected, in order to solve the numerical imbalance problem of chest X-ray image data sets covering multiple diseases.
다만, 본 실시예가 이루고자 하는 기술적 과제는 상기된 바와 같은 기술적 과제들로 한정되지 않으며, 또 다른 기술적 과제들이 존재할 수 있다. However, the technical problems to be achieved by the present embodiment are not limited to the technical problems described above, and other technical problems may exist.
상술한 기술적 과제를 달성하기 위한 기술적 수단으로서, 본 발명의 일 실시예는 잠재변수 및 특정 질환의 특징에 대한 조건변수를 입력받고 각 분기마다 업샘플링(up-sampling)하여 각각 해상도가 다른 상기 특정 질환을 나타내는 복수의 흉부 X선 영상을 생성하는 생성자 및 상기 복수의 흉부 X선 영상의 진위 여부를 구별하는 판별자를 포함하는 멀티 스케일 조건부 적대적 생성 신경망을 구축하는 모델 구축부 및 상기 멀티 스케일 조건부 적대적 생성 신경망에 상기 잠재변수 및 특정 질환의 특징에 대한 조건변수를 입력하여 기설정된 값 이상의 해상도를 갖고 상기 특정 질환을 나타내는 흉부 X선 영상을 생성하는 영상 생성부를 포함하는 것인, 흉부 X선 영상 생성 장치를 제공할 수 있다. As a technical means for achieving the above-described technical object, an embodiment of the present invention may provide an apparatus for generating a chest X-ray image, comprising: a model building unit for constructing a multi-scale conditional generative adversarial network including a generator that receives a latent variable and a condition variable for the characteristics of a specific disease and performs up-sampling at each branch to generate a plurality of chest X-ray images of different resolutions representing the specific disease, and a discriminator that discriminates whether the plurality of chest X-ray images are real or fake; and an image generating unit for generating a chest X-ray image representing the specific disease with a resolution equal to or greater than a preset value by inputting the latent variable and the condition variable into the multi-scale conditional generative adversarial network.
또한, 본 발명의 다른 실시예는 잠재변수 및 특정 질환의 특징에 대한 조건변수를 입력받고 각 분기마다 업샘플링(up-sampling)하여 각각 해상도가 다른 상기 특정 질환을 나타내는 복수의 흉부 X선 영상을 생성하는 생성자 및 상기 복수의 흉부 X선 영상의 진위 여부를 구별하는 판별자를 포함하는 멀티 스케일 조건부 적대적 생성 신경망을 생성하는 단계 및 상기 멀티 스케일 조건부 적대적 생성 신경망에 상기 잠재변수 및 특정 질환의 특징에 대한 조건변수를 입력하여 기설정된 값 이상의 해상도를 갖고 상기 특정 질환을 나타내는 흉부 X선 영상을 생성하는 단계를 포함하는 것인, 흉부 X선 영상 생성 방법을 제공할 수 있다. In addition, another embodiment of the present invention may provide a method for generating a chest X-ray image, comprising: constructing a multi-scale conditional generative adversarial network including a generator that receives a latent variable and a condition variable for the characteristics of a specific disease and performs up-sampling at each branch to generate a plurality of chest X-ray images of different resolutions representing the specific disease, and a discriminator that discriminates whether the plurality of chest X-ray images are real or fake; and generating a chest X-ray image representing the specific disease with a resolution equal to or greater than a preset value by inputting the latent variable and the condition variable into the multi-scale conditional generative adversarial network.
상술한 과제 해결 수단은 단지 예시적인 것으로서, 본 발명을 제한하려는 의도로 해석되지 않아야 한다. 상술한 예시적인 실시예 외에도, 도면 및 발명의 상세한 설명에 기재된 추가적인 실시예가 존재할 수 있다.The above-described problem solving means are merely exemplary, and should not be construed as limiting the present invention. In addition to the exemplary embodiments described above, there may be additional embodiments described in the drawings and detailed description.
전술한 본 발명의 과제 해결 수단 중 어느 하나에 의하면, 입력으로 조건 변수를 추가하여 생성되는 흉부 X선 영상의 목표 질환을 제어할 수 있어 학습의 비효율성을 제거하였을 뿐만 아니라 극단적으로 불균형한 경우 학습이 불가능한 문제를 해결할 수 있다. According to any one of the above-described means for solving the problems of the present invention, the target disease of the generated chest X-ray image can be controlled by adding a condition variable as an input, which not only removes the inefficiency of training a separate network per disease but also solves the problem that training is impossible when the data are extremely imbalanced.
또한, 주목 메커니즘(attention mechanism)의 적용을 통해 국소적인 부분만이 아니라 영상 전체적으로 형태의 일관성이 보존되어 더 사실과 가까운 영상을 생성할 수 있다.In addition, through the application of an attention mechanism, it is possible to generate a more realistic image by preserving the coherence of the form as a whole, not just a local part.
도 1은 본 발명의 일 실시예에 따른 흉부 X선 영상 생성 장치를 도시한 도면이다.1 is a diagram illustrating an apparatus for generating a chest X-ray image according to an embodiment of the present invention.
도 2는 본 발명의 일 실시예에 따른 멀티 스케일 조건부 적대적 생성 신경망의 네트워크 구조를 도시한 도면이다.2 is a diagram illustrating a network structure of a multi-scale conditional adversarial generating neural network according to an embodiment of the present invention.
도 3은 본 발명의 일 실시예에 따른 생성자 및 판별자의 손실 그래프를 도시한 도면이다.3 is a diagram illustrating a loss graph of a generator and a discriminator according to an embodiment of the present invention.
도 4는 본 발명의 실험예와 비교예의 성능 비교 결과로서 프레쳇 덴스넷 거리를 나타낸 도면이다. 4 is a diagram showing the Fréchet DenseNet distance as a result of comparing the performance of an experimental example of the present invention with a comparative example.
도 5는 본 발명의 실험예와 비교예의 성능 비교 결과로서 생성된 흉부 X선 영상을 도시한 도면이다.5 is a view showing a chest X-ray image generated as a result of comparing the performance of the experimental example and the comparative example of the present invention.
도 6은 본 발명의 일 실시예에 따른 흉부 X선 영상 생성 방법을 나타낸 흐름도이다.6 is a flowchart illustrating a method for generating a chest X-ray image according to an embodiment of the present invention.
아래에서는 첨부한 도면을 참조하여 본 발명이 속하는 기술 분야에서 통상의 지식을 가진 자가 용이하게 실시할 수 있도록 본 발명의 실시예를 상세히 설명한다. 그러나 본 발명은 여러 가지 상이한 형태로 구현될 수 있으며 여기에서 설명하는 실시예에 한정되지 않는다. 그리고 도면에서 본 발명을 명확하게 설명하기 위해서 설명과 관계없는 부분은 생략하였으며, 명세서 전체를 통하여 유사한 부분에 대해서는 유사한 도면 부호를 붙였다. DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings so that those skilled in the art can easily carry out the present invention. However, the present invention may be embodied in several different forms and is not limited to the embodiments described herein. And in order to clearly explain the present invention in the drawings, parts irrelevant to the description are omitted, and similar reference numerals are attached to similar parts throughout the specification.
명세서 전체에서, 어떤 부분이 다른 부분과 "연결"되어 있다고 할 때, 이는 "직접적으로 연결"되어 있는 경우뿐 아니라, 그 중간에 다른 소자를 사이에 두고 "전기적으로 연결"되어 있는 경우도 포함한다. 또한 어떤 부분이 어떤 구성요소를 "포함"한다고 할 때, 이는 특별히 반대되는 기재가 없는 한 다른 구성요소를 제외하는 것이 아니라 다른 구성요소를 더 포함할 수 있는 것을 의미하며, 하나 또는 그 이상의 다른 특징이나 숫자, 단계, 동작, 구성요소, 부분품 또는 이들을 조합한 것들의 존재 또는 부가 가능성을 미리 배제하지 않는 것으로 이해되어야 한다. Throughout the specification, when a part is said to be "connected" to another part, this includes not only the case of being "directly connected" but also the case of being "electrically connected" with another element interposed therebetween. In addition, when a part is said to "include" a certain component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise, and it should be understood that the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof is not precluded in advance.
본 명세서에 있어서 '부(部)'란, 하드웨어에 의해 실현되는 유닛(unit), 소프트웨어에 의해 실현되는 유닛, 양방을 이용하여 실현되는 유닛을 포함한다. 또한, 1 개의 유닛이 2 개 이상의 하드웨어를 이용하여 실현되어도 되고, 2 개 이상의 유닛이 1 개의 하드웨어에 의해 실현되어도 된다.In this specification, a "part" includes a unit realized by hardware, a unit realized by software, and a unit realized using both. In addition, one unit may be implemented using two or more hardware, and two or more units may be implemented by one hardware.
본 명세서에 있어서 단말 또는 디바이스가 수행하는 것으로 기술된 동작이나 기능 중 일부는 해당 단말 또는 디바이스와 연결된 서버에서 대신 수행될 수도 있다. 이와 마찬가지로, 서버가 수행하는 것으로 기술된 동작이나 기능 중 일부도 해당 서버와 연결된 단말 또는 디바이스에서 수행될 수도 있다.Some of the operations or functions described as being performed by the terminal or device in this specification may be instead performed by a server connected to the terminal or device. Similarly, some of the operations or functions described as being performed by the server may also be performed in a terminal or device connected to the corresponding server.
이하 첨부된 도면을 참고하여 본 발명의 일 실시예를 상세히 설명하기로 한다. Hereinafter, an embodiment of the present invention will be described in detail with reference to the accompanying drawings.
도 1은 본 발명의 일 실시예에 따른 흉부 X선 영상 생성 장치를 도시한 도면이고, 도 2는 본 발명의 일 실시예에 따른 멀티 스케일 조건부 적대적 생성 신경망의 네트워크 구조를 도시한 도면이다.1 is a diagram illustrating an apparatus for generating a chest X-ray image according to an embodiment of the present invention, and FIG. 2 is a diagram illustrating a network structure of a multi-scale conditional adversarial generation neural network according to an embodiment of the present invention.
X선 영상 생성 장치(1)는 멀티 스케일 조건부 적대적 생성 신경망을 통해 흉부 X선 영상 내 늑골, 횡격막, 폐, 심장 등의 세부적인 특징들을 잘 반영하여 고해상도 영상을 생성할 수 있다. The X-ray image generating apparatus 1 can generate a high-resolution image that well reflects detailed features such as the ribs, diaphragm, lungs, and heart in a chest X-ray image through a multi-scale conditional generative adversarial network.
즉, 흉부 X선 영상 내 기관들의 세부적인 특징들을 잘 반영하기 위해서는 고해상도 영상이 필수적으로 요구되는데, 본원에 따르면 저해상도 영상에서 고해상도 영상까지 멀티 스케일 영상 분포를 학습하여 고해상도 영상을 생성할 수 있다.That is, a high-resolution image is essential to reflect detailed characteristics of organs in a chest X-ray image. According to the present application, a high-resolution image can be generated by learning the multi-scale image distribution from a low-resolution image to a high-resolution image.
본원에 따르면, 멀티 스케일 조건부 적대적 생성 신경망은 질환 조건 제어인자인 조건변수를 추가해 단 하나의 네트워크만으로 다수 질환에 대한 영상을 생성할 수 있다.According to the present application, the multi-scale conditional adversarial generation neural network can generate images for multiple diseases with only one network by adding a condition variable, which is a disease condition control factor.
한편, 흉부 X선 영상의 경우 환자마다 골격이나 기관의 모습이 다르고(patient-specific) 혈관과 같은 주변기관과 잡음으로 인해 명암도 레벨의 분포가 불명확하여 지역적 특성만 고려해서는 질환이 무엇인지 판단하기 어렵다. 또한, 질환 영역이 영상 전체에 걸쳐 있거나 여러 개가 멀리 떨어져 분포할 수 있기 때문에 영상 내 장거리 종속성 문제 해결이 필수적으로 요구된다. On the other hand, in the case of chest X-ray images, the skeleton and organs look different for each patient (patient-specific), and the distribution of intensity levels is unclear owing to neighboring structures such as blood vessels and to noise, so it is difficult to determine the disease from local characteristics alone. In addition, because a diseased region may span the entire image or several regions may be distributed far apart, resolving the long-range dependency problem within the image is essential.
본원에 따르면, 멀티 스케일 조건부 적대적 생성 신경망에 주목 메커니즘을 적용하여 생성되는 흉부 X선 영상 내 장거리 종속성 문제를 해결함으로써 흉부 X선 영상의 생성 성능을 향상시킬 수 있다. According to the present application, the generation performance for chest X-ray images can be improved by applying an attention mechanism to the multi-scale conditional generative adversarial network, thereby resolving the long-range dependency problem in the generated chest X-ray images.
X선 영상 생성 장치(1)의 일예는 데스크탑, 노트북 등과 같은 퍼스널 컴퓨터(personal computer)뿐만 아니라 유무선 통신이 가능한 모바일 단말을 포함할 수 있다. 모바일 단말은 휴대성과 이동성이 보장되는 무선 통신 장치로서, 스마트폰(smartphone), 태블릿 PC, 웨어러블 디바이스뿐만 아니라, 블루투스(BLE, Bluetooth Low Energy), NFC, RFID, 초음파(Ultrasonic), 적외선, 와이파이(WiFi), 라이파이(LiFi) 등의 통신 모듈을 탑재한 각종 디바이스를 포함할 수 있다. 다만, X선 영상 생성 장치(1)는 도 1에 도시된 형태 또는 앞서 예시된 것들로 한정 해석되는 것은 아니다. An example of the X-ray image generating apparatus 1 may include not only a personal computer such as a desktop or laptop, but also a mobile terminal capable of wired/wireless communication. The mobile terminal is a wireless communication device with guaranteed portability and mobility and may include smartphones, tablet PCs, and wearable devices, as well as various devices equipped with communication modules such as Bluetooth Low Energy (BLE), NFC, RFID, ultrasonic, infrared, WiFi, and LiFi. However, the X-ray image generating apparatus 1 is not to be construed as limited to the form shown in FIG. 1 or to the examples listed above.
도 1 및 도 2를 참조하면, 흉부 X선 영상 생성 장치(1)는 모델 구축부(100) 및 영상 생성부(110)를 포함할 수 있다. Referring to FIGS. 1 and 2, the chest X-ray image generating apparatus 1 may include a model building unit 100 and an image generating unit 110.
모델 구축부(100)는 생성자(201) 및 판별자(203, 205, 207)를 포함하는 멀티 스케일 조건부 적대적 생성 신경망(200)을 구축할 수 있다. The model building unit 100 may construct a multi-scale conditional generative adversarial network 200 including a generator 201 and discriminators 203, 205, and 207.
멀티 스케일 조건부 적대적 생성 신경망(200)은 예를 들어 StackGAN++과 LSGAN을 확장하여 저해상도 영상에서 고해상도 영상까지 멀티 스케일 영상 분포를 학습하는 네트워크일 수 있다. The multi-scale conditional generative adversarial network 200 may be, for example, a network that extends StackGAN++ and LSGAN to learn multi-scale image distributions from low-resolution to high-resolution images.
멀티 스케일 조건부 적대적 생성 신경망(200)은 하나의 생성자(201)와 다수개의 판별자(203, 205, 207)로 구성되어 트리 구조의 네트워크로 이루어질 수 있다. The multi-scale conditional generative adversarial network 200 may be composed of one generator 201 and a plurality of discriminators 203, 205, and 207, forming a tree-structured network.
멀티 스케일 조건부 적대적 생성 신경망(200)의 생성자(201)는 잠재변수 및 특정 질환의 특징에 대한 조건변수를 입력받고 각 분기마다 업샘플링(up-sampling)하여 각각 해상도가 다른 특정 질환을 나타내는 복수의 흉부 X선 영상을 생성할 수 있다. The generator 201 of the multi-scale conditional generative adversarial network 200 may receive a latent variable and a condition variable for the characteristics of a specific disease and perform up-sampling at each branch to generate a plurality of chest X-ray images of different resolutions representing the specific disease.
예를 들어, 생성자(201)는 생성자(201) 내 각 분기(branch)별로 저해상도 영상에서 고해상도 영상까지 멀티 스케일 영상분포를 학습해 점진적으로 고화질 흉부 X선 영상을 생성할 수 있다. For example, the generator 201 may learn multi-scale image distributions from low-resolution to high-resolution images at each branch within the generator 201, progressively generating high-quality chest X-ray images.
이 때, 조건변수는 잠재변수의 일부를 대신해 생성할 흉부 X선 영상을 질환별로 제어할 수 있도록 한다. Here, the condition variable, in place of a part of the latent variable, allows the chest X-ray image to be generated to be controlled per disease.
조건변수는 복수의 흉부 질환을 나타내는 복수의 흉부 X선 영상을 생성하기 위해 복수의 클래스 중 어느 하나의 클래스에 대한 원-핫 인코딩(One-Hot Encoding) 값일 수 있다. 예를 들어, 본원에 따르면 대표적인 8개의 흉부 질환에 해당하는 흉부 X선 영상을 생성하기 위해 조건변수에는 8개의 클래스에 대한 원-핫 인코딩 값이 할당될 수 있다.The condition variable may be a one-hot encoding value for any one of a plurality of classes in order to generate a plurality of chest X-ray images indicating a plurality of chest diseases. For example, according to the present disclosure, one-hot encoding values for 8 classes may be assigned to the condition variable in order to generate chest X-ray images corresponding to 8 representative chest diseases.
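The latent-plus-condition input described above can be sketched as follows, a minimal hypothetical example (not the patent's code): a latent vector drawn from a standard normal distribution is concatenated with a one-hot condition vector selecting one of the 8 chest-disease classes.

```python
# Sketch of forming the generator input: latent variable z ~ N(0, 1)
# concatenated with a one-hot condition variable over 8 disease classes.
import random

NUM_CLASSES = 8  # e.g. atelectasis, cardiomegaly, effusion, infiltration,
                 # mass, nodule, pneumonia, pneumothorax

def one_hot(class_idx, num_classes=NUM_CLASSES):
    vec = [0.0] * num_classes
    vec[class_idx] = 1.0
    return vec

def generator_input(latent_dim, class_idx, seed=0):
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]  # latent variable
    c = one_hot(class_idx)                                # condition variable
    return z + c  # concatenated input to the first generator layer

x = generator_input(100, class_idx=2)
print(len(x))   # 108 = latent_dim + NUM_CLASSES
print(x[100:])  # the one-hot block: class 2 set to 1.0
```

Changing only `class_idx` switches the target disease of the generated image while the single network is reused for all classes.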
즉, 표준 정규 분포로 표현된 초기 분포는 조건변수와 함께 생성자(201)의 여러 은닉 층(hidden layer)을 거치면서 각 분기별마다 해당 스케일의 표준 데이터 분포 내 데이터로 근사화될 수 있다. That is, the initial distribution, expressed as a standard normal distribution, may be approximated, together with the condition variable, to data within the standard data distribution of the corresponding scale at each branch while passing through several hidden layers of the generator 201.
또한, 멀티 스케일 조건부 적대적 생성 신경망(200)의 판별자(203, 205, 207)는 복수의 흉부 X선 영상의 진위 여부를 구별할 수 있다. 뿐만 아니라, 판별자(203, 205, 207)는 종래의 적대적 생성 신경망의 판별자와 달리 생성자(201)가 생성한 복수의 흉부 X선 영상이 조건변수를 만족하는지 여부를 추가로 구별할 수 있다. In addition, the discriminators 203, 205, and 207 of the multi-scale conditional generative adversarial network 200 may discriminate whether the plurality of chest X-ray images are real or fake. Moreover, unlike the discriminator of a conventional generative adversarial network, the discriminators 203, 205, and 207 may additionally discriminate whether the plurality of chest X-ray images generated by the generator 201 satisfy the condition variable.
즉, 판별자(203, 205, 207)는 생성자(201)의 각 분기별에 대응하는 판별자(203, 205, 207)가 생성자(201)에 의해 생성된 멀티 스케일 영상이 잘 만들어졌는지 여부를 평가함으로써 생성자(201)가 최적화 될 수 있도록 인도한다. That is, the discriminators 203, 205, and 207, each corresponding to a branch of the generator 201, guide the generator 201 toward optimization by evaluating whether the multi-scale images generated by the generator 201 are well formed.
생성자(201)는 총 3개의 분기로 나눠져 저해상도에서 고해상도의 흉부 X선 영상을 생성할 수 있다. The generator 201 may be divided into a total of three branches to generate chest X-ray images from low resolution to high resolution.
즉, 생성자(201)는 제 1 해상도를 갖는 흉부 X선 영상을 생성하는 제 1 분기(209), 제 1 해상도보다 높은 제 2 해상도를 갖는 흉부 X선 영상을 생성하는 제 2 분기(211) 및 제 2 해상도보다 높은 제 3 해상도를 갖는 흉부 X선 영상을 생성하는 제 3 분기(213)를 포함할 수 있다. That is, the generator 201 may include a first branch 209 generating a chest X-ray image with a first resolution, a second branch 211 generating a chest X-ray image with a second resolution higher than the first resolution, and a third branch 213 generating a chest X-ray image with a third resolution higher than the second resolution.
예를 들어, 생성자(201)는 제 1 분기에서 제 1 해상도의 영상 분포를 근사화하여 기본 색상 및 구조를 갖는 흉부 X선 영상을 생성하고, 제 2 분기에서 기본 색상 및 구조를 갖는 흉부 X선 영상의 경사(gradient)를 전달받아 제 1 해상도보다 높은 제 2 해상도의 영상 분포를 근사화하여 세부 정보를 표현하는 흉부 X선 영상을 생성할 수 있다. For example, the generator 201 may approximate the image distribution of the first resolution in the first branch to generate a chest X-ray image having basic colors and structure, and may receive the gradient of that image in the second branch to approximate the image distribution of the second resolution, higher than the first, generating a chest X-ray image expressing finer detail.
예를 들어, 제 1 분기(209)는 저해상도 영상(64x64)을 위한 서브-생성자로서 4개의 업-블록층(up-block layer), 주목층(attention layer) 및 3x3 합성곱층(convolution layer)을 포함할 수 있다. 여기서, 업-블록층에는 체커보드 인공물(checkerboard-artifact) 발생을 완화시키기 위해 업샘플링으로 최근접 이웃(nearest-neighbor) 방법이 사용될 수 있다. For example, the first branch 209 is a sub-generator for low-resolution images (64x64) and may include four up-block layers, an attention layer, and a 3x3 convolution layer. Here, the nearest-neighbor method may be used for up-sampling in the up-block layers to alleviate checkerboard artifacts.
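The nearest-neighbor up-sampling used in the up-block layers can be sketched as below. This is a hedged, minimal illustration on a 2D list (not the patent's implementation); in an up-block it would precede a convolution, and it avoids the checkerboard artifacts associated with transposed convolutions because every output pixel is an exact copy of an input pixel.

```python
# Sketch of nearest-neighbor up-sampling: each pixel is repeated `scale`
# times horizontally and each row `scale` times vertically.

def nearest_neighbor_upsample(img, scale=2):
    out = []
    for row in img:
        wide = [p for p in row for _ in range(scale)]  # repeat pixels
        for _ in range(scale):                         # repeat rows
            out.append(list(wide))
    return out

img = [[1, 2],
       [3, 4]]
print(nearest_neighbor_upsample(img))
# 2x2 -> 4x4; each input value fills a 2x2 block of the output
```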
제 2 분기(211) 및 제 3 분기(213)의 서브-생성자는 결합(joining) 층과 두 번의 잔차층(residual layer) 및 업-블록층(up-block layer)을 포함할 수 있다. 제 2 분기(211) 및 제 3 분기(213)에서는 4단계의 층을 통과한 후 3x3 합성곱을 거쳐 각각 128x128 흉부 X선 영상과 256x256 흉부 X선 영상을 출력할 수 있다. The sub-generators of the second branch 211 and the third branch 213 may include a joining layer, two residual layers, and an up-block layer. In the second branch 211 and the third branch 213, after passing through these four stages of layers, a 3x3 convolution may output a 128x128 chest X-ray image and a 256x256 chest X-ray image, respectively.
각 분기별 판별자(203, 205, 207)는 다운샘플링(down-sampling) 층으로 구성되며 조건과 비조건 손실함수 계산을 위해 마지막 층 전단계의 입력부분을 두 부분으로 나누어 한 곳에만 조건변수를 결합시킬 수 있다. Each branch discriminator 203, 205, 207 consists of down-sampling layers, and to compute the conditional and unconditional loss functions, the input immediately preceding the last layer may be divided into two parts, with the condition variable concatenated to only one of them.
판별자(203, 205, 207)는 제 1 분기(209)에서 생성된 흉부 X선 영상의 진위 여부를 판별하는 제 1 판별자(203), 제 2 분기(211)에서 생성된 흉부 X선 영상의 진위 여부를 판별하는 제 2 판별자(205) 및 제 3 분기(213)에서 생성된 흉부 X선 영상의 진위 여부를 판별하는 제 3 판별자(207)를 포함할 수 있다. The discriminators may include a first discriminator 203 that determines whether the chest X-ray image generated in the first branch 209 is real or fake, a second discriminator 205 that does so for the image generated in the second branch 211, and a third discriminator 207 that does so for the image generated in the third branch 213.
즉, 생성자(201)의 총 m개의 분기에 대하여 순차적으로 저해상도 영상 판별자(i=1)와 고해상도 영상 판별자(i=m)가 있으며, i번째 생성자(201) 분기의 판별자의 손실함수는 수학식 1과 같이 정의될 수 있다. 여기서, a, b 값은 각각 0, 1로 설정된다. That is, for the total of m branches of the generator 201 there are, in order, a low-resolution image discriminator (i=1) through a high-resolution image discriminator (i=m), and the loss function of the discriminator for the i-th generator branch may be defined as in Equation 1, where the values of a and b are set to 0 and 1, respectively.
수학식 1에서, s_i는 생성자로부터 생성된 크기별 샘플을 의미한다. 즉, s_i = G_i(h_i)이고 i는 0, 1, ..., m-1의 범위를 갖는다. 또한 h_i = F(h_{i-1}, z)이며, 여기서 h_i는 G_i의 은닉 피처(hidden feature)일 수 있다. In Equation 1, s_i denotes the sample of each scale generated by the generator; that is, s_i = G_i(h_i), where i ranges over 0, 1, ..., m-1. Also, h_i = F(h_{i-1}, z), where h_i may be a hidden feature of G_i.
생성자(201)의 경우는 각 분기별로 판별자 손실함수들의 합으로 표현되며, 손실함수는 수학식 2와 같이 정의될 수 있다. 여기서, d 값은 1로 설정된다. The loss of the generator 201 is expressed as the sum of the discriminator loss functions over the branches, and may be defined as in Equation 2, where the value of d is set to 1.
즉, 생성자(201)는 저해상도에서 고해상도의 영상들에 대하여 비조건부 영상과 조건부 영상의 분포를 근사화하는 방향으로 학습된다. That is, the generator 201 is trained, for images from low to high resolution, in the direction of approximating the distributions of both unconditional and conditional images.
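The least-squares objectives described above, with labels a=0 (fake) and b=1 (real) for the discriminator and d=1 for the generator, can be sketched as follows. This is an illustrative stand-in, assuming scalar discriminator scores in place of the patent's networks, and it omits the sum over branches and over conditional/unconditional terms.

```python
# Hedged sketch of LSGAN-style losses with a=0, b=1 (discriminator) and
# d=1 (generator), computed on toy scalar discriminator scores.

def lsq(scores, target):
    """Mean squared distance between discriminator scores and a target label."""
    return sum((s - target) ** 2 for s in scores) / len(scores)

def discriminator_loss(real_scores, fake_scores, a=0.0, b=1.0):
    # push scores on real samples toward b and on generated samples toward a
    return lsq(real_scores, b) + lsq(fake_scores, a)

def generator_loss(fake_scores, d=1.0):
    # push scores on generated samples toward the "real" label d
    return lsq(fake_scores, d)

real = [0.9, 0.8]  # toy D outputs on real images
fake = [0.2, 0.1]  # toy D outputs on generated images
print(discriminator_loss(real, fake))  # 0.05: D separates real from fake well
print(generator_loss(fake))            # 0.725: G is far from fooling D
```

In the full network, this loss is evaluated once per branch, so each scale produces its own gradient, which is what stabilizes training of the early layers.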
이처럼 멀티 스케일에서 영상 분포를 모델링하게 되면 각 스케일별 경사가 발생되어 전달되기 때문에 초기 층까지 경사를 잘 전달한다는 장점을 갖고 있다. 결국 이러한 특징은 네트워크 학습을 안정화시키는데 핵심적인 역할을 해주어 고해상도 영상 생성이 가능하게 된다.In this way, modeling the image distribution in multi-scale has the advantage of transferring the gradient to the initial layer well because the gradient for each scale is generated and transmitted. Ultimately, these features play a key role in stabilizing network learning, making it possible to generate high-resolution images.
도 3은 본 발명의 일 실시예에 따른 생성자 및 판별자의 손실 그래프를 도시한 도면이다.3 is a diagram illustrating a loss graph of a generator and a discriminator according to an embodiment of the present invention.
도 3의 (a)는 학습단계에서 생성자(201)의 손실 그래프를 나타내며, 도 3의 (b)는 판별자(203, 205, 207)의 손실 그래프를 나타낸다. 여기서, x축은 반복(iteration), y축은 손실(loss)을 의미한다. FIG. 3(a) shows the loss graph of the generator 201 during training, and FIG. 3(b) shows the loss graph of the discriminators 203, 205, and 207. Here, the x-axis denotes iterations and the y-axis denotes loss.
두 신경망은 서로 경쟁하는 관계로 각 신경망 입장에서는 손실 값이 커야 올바르게 학습되고 있음을 의미한다.Since the two neural networks compete with each other, it means that each neural network is properly trained only when the loss value is large.
도 4는 본 발명의 실험예와 비교예의 성능 비교 결과로서 프레쳇 덴스넷 거리를 나타낸 도면이고, 도 5는 본 발명의 실험예와 비교예의 성능 비교 결과로서 생성된 흉부 X선 영상을 도시한 도면이다. FIG. 4 shows the Fréchet DenseNet distance as a performance comparison between an experimental example of the present invention and a comparative example, and FIG. 5 shows chest X-ray images generated as a result of that performance comparison.
본 출원인은 흉부 X선 영상의 데이터 증강 실험을 위하여 NIH Clinical Center에서 제공한 14개 심, 폐질환에 대한 112,120건의 흉부 X선 영상 데이터를 사용하였다. 그 중 8개의 대표 질환(무기폐, 심장 비대, 흉수, 침윤, 종괴, 결절, 폐렴, 기흉)만을 사용하였다.The present applicant used 112,120 chest X-ray image data for 14 heart and lung diseases provided by the NIH Clinical Center for the chest X-ray image data augmentation experiment. Among them, only 8 representative diseases (atelectasis, cardiac hypertrophy, pleural effusion, infiltration, mass, nodule, pneumonia, and pneumothorax) were used.
본 출원인은 성능비교를 위해 8개의 흉부 대표 질환에 대해 2가지 실험을 진행하였다. The present applicant conducted two experiments for 8 representative diseases of the chest for performance comparison.
첫 번째 실험은 1) 종래의 심층 합성곱 적대적 생성 신경망과 2) 본원의 멀티 스케일 조건부 적대적 생성 신경망으로 고해상도 영상(256x256)을 질환별로 생성한 후 생성자료의 품질을 평가하기 위해 새롭게 제안한 지표인 프레쳇 덴스넷 거리(Fre'chet DenseNet distance, FDD)로 네트워크의 성능을 비교한 것이다(도 4). In the first experiment, high-resolution (256x256) images were generated per disease with 1) a conventional deep convolutional generative adversarial network and 2) the multi-scale conditional generative adversarial network of the present application, and network performance was compared using the Fréchet DenseNet distance (FDD), a newly proposed metric for evaluating the quality of generated data (FIG. 4).
실험에서 사용한 영상의 개수는 대략 10만장으로 batch size를 16으로 하였을 때 한 에폭당 7,000번의 반복이 이루어진다. 도 3의 그래프를 보면 판별자는 5 에폭 이후 점차 안정화 되었고 생성자는 20 에폭(반복 140,000) 부근에서 함수 변동(fluctuation)이 안정화 되었다. 그 이상 학습을 진행했을 시 두 모델이 균형을 잃어 모두 붕괴(mode collapse)가 발생하여 학습을 중단하였다. The number of images used in the experiment was roughly 100,000, so with a batch size of 16 about 7,000 iterations are performed per epoch. In the graph of FIG. 3, the discriminator gradually stabilized after 5 epochs, and the fluctuation of the generator's loss stabilized around 20 epochs (iteration 140,000). When training proceeded beyond this point, the two models lost their balance and mode collapse occurred, so training was stopped.
프레쳇 인셉션 거리(Fre'chet Inception Distance, FID)는 두 정규 분포의 차이를 측정한 것으로 인셉션 점수(Inception Score, IS)가 실제 자료의 분포를 사용하지 않는 단점을 보완하기 위해서 제안되었다. FID는 수학식 3과 같이 계산되는데 값이 작을수록 좋은 품질을 의미한다. 여기서, (m, C)와 (m_w, C_w)는 생성자료와 실제 자료의 평균과 공분산이다. The Fréchet Inception Distance (FID) measures the difference between two normal distributions and was proposed to compensate for the drawback that the Inception Score (IS) does not use the distribution of real data. The FID is calculated as in Equation 3, where a smaller value means better quality. Here, (m, C) and (m_w, C_w) are the mean and covariance of the generated data and the real data, respectively.
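The Fréchet distance of Equation 3 can be sketched for the simplified case of diagonal covariances, where the matrix square root reduces to element-wise square roots: d² = |m − m_w|² + Tr(C + C_w − 2(C·C_w)^{1/2}). This is illustrative only; real FID/FDD computation uses full covariance matrices of deep features.

```python
# Sketch of the Frechet distance between two Gaussians (m, C) and (m_w, C_w)
# for diagonal covariances given as per-dimension variance lists.
import math

def frechet_distance_diag(m, c, m_w, c_w):
    mean_term = sum((a - b) ** 2 for a, b in zip(m, m_w))
    # diagonal case: Tr(C + C_w - 2 sqrt(C C_w)) = sum (sqrt(c) - sqrt(c_w))^2
    cov_term = sum(ci + cj - 2.0 * math.sqrt(ci * cj) for ci, cj in zip(c, c_w))
    return mean_term + cov_term

# identical distributions -> distance 0
print(frechet_distance_diag([0.0, 0.0], [1.0, 1.0], [0.0, 0.0], [1.0, 1.0]))
# shifted mean -> positive distance
print(frechet_distance_diag([1.0, 0.0], [1.0, 1.0], [0.0, 0.0], [1.0, 1.0]))
```

FDD follows the same formula but takes (m, C) and (m_w, C_w) from a DenseNet trained on medical images rather than an ImageNet Inception model.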
하지만, 1,000개의 클래스와 120만개로 구성된 자연 이미지인 ImageNet을 사전 학습한 인셉션 모델을 기반으로 각 정규분포를 구하게 되면 분포를 구할 때 사용되는 이미지가 사전 학습된 모델을 통해 1,000개의 클래스에 속할 확률 벡터를 출력한다는 점에서 문제가 된다. 그 이유는 인셉션 모델이 자연 이미지를 기반으로 학습되었기 때문에 흉부 X선과 같은 의료 영상을 클래스로 포함하지 않기 때문이다. 즉, 흉부 X선 영상은 자연 이미지가 아니기 때문에 이렇게 특화된 특징을 해석하기에 적합하지 않다. 따라서 의료 영상의 분포를 구하고 분포간 거리를 통해 품질을 평가하기 위해서는 의료 도메인에 최적화되어진 모델이 필요하다. 이에, 의료영상을 학습데이터로 하여 DenseNet을 학습시킨 후 이 모델을 기반으로 두 의료 영상의 정규분포를 구해 거리를 측정하는 프레쳇 덴스넷 거리(Fre'chet DenseNet Distance, FDD)를 새로운 지표로 제안하여 실험을 수행하였다. However, when each normal distribution is obtained based on an Inception model pre-trained on ImageNet, a natural-image data set of 1,000 classes and 1.2 million images, a problem arises in that the pre-trained model outputs, for each image used to compute the distribution, a probability vector over those 1,000 classes. This is because the Inception model was trained on natural images and does not include medical images such as chest X-rays among its classes. In other words, since a chest X-ray image is not a natural image, such a model is not suitable for interpreting these specialized features. Therefore, a model optimized for the medical domain is needed to obtain the distributions of medical images and to evaluate quality through the distance between distributions. Accordingly, after training a DenseNet with medical images as training data, the Fréchet DenseNet Distance (FDD), which measures the distance between the normal distributions of two sets of medical images based on this model, was proposed as a new metric and used in the experiments.
우선, 첫 번째 실험을 진행하기 위해 학습데이터 78,468개, 검증데이터 11,219개로 DenseNet-121을 학습하였다. 그리고 학습된 모델을 기반으로 실제 영상과 cDCGAN 그리고 실제 영상과 제안한 모델에 대한 FDD값을 구하여 모델의 성능을 정량적으로 평가하였다. 여기서 실제 영상은 DenseNet-121 학습에 사용되지 않은 나머지 부분을 의미한다. 하지만 의료영상 특성상 데이터 불균형이 심해 각 클래스마다 영상 데이터의 개수가 고르지 못하다. 따라서 클래스 별 실제영상과 생성된 영상의 개수는 Atelectasis 1,000개, Cardiomegaly 575개, Effusion 1,000개, Infiltration 1,000개, Mass 729개, Nodule 774개, Pneumonia 242개, Pneumothorax 539개로 총 5,859개의 영상을 동일하게 맞추어 테스트를 수행하였다. First, for the first experiment, a DenseNet-121 was trained with 78,468 training images and 11,219 validation images. Based on the trained model, FDD values were computed between real images and cDCGAN outputs, and between real images and the proposed model's outputs, to quantitatively evaluate performance. Here, the real images are the remainder not used for DenseNet-121 training. However, because of the severe data imbalance characteristic of medical imaging, the number of images per class is uneven. Therefore, the numbers of real and generated images per class were matched at 1,000 for Atelectasis, 575 for Cardiomegaly, 1,000 for Effusion, 1,000 for Infiltration, 729 for Mass, 774 for Nodule, 242 for Pneumonia, and 539 for Pneumothorax, a total of 5,859 images, and the test was performed accordingly.
그 결과 도 4에서 보이듯이 실제 영상과 본원의 멀티 스케일 조건부 적대적 생성 신경망을 통해 생성한 영상의 거리를 계산한 FDD값이 실제 영상과 종래의 심층 합성곱 적대적 생성 신경망인 cDCGAN에서 생성한 영상의 거리를 계산한 FDD값보다 대체로 낮으므로 본원의 멀티 스케일 조건부 적대적 생성 신경망이 더 우수하다는 것을 확인할 수 있다. As a result, as shown in FIG. 4, the FDD values between real images and images generated by the multi-scale conditional generative adversarial network of the present application are generally lower than the FDD values between real images and images generated by cDCGAN, the conventional deep convolutional generative adversarial network, confirming that the network of the present application is superior.
두 번째 실험은 종래의 적대적 생성 신경망인 DCGAN과 본원의 멀티 스케일 조건부 적대적 생성 신경망을 통해 나온 실험결과를 실제 영상과 정성적 비교한 것으로, 본원의 멀티 스케일 조건부 적대적 생성 신경망의 결과에 대해서는 질환별로 질환위치를 찾아 표시하였다. The second experiment qualitatively compared the results of DCGAN, a conventional generative adversarial network, and of the multi-scale conditional generative adversarial network of the present application against real images; for the results of the present network, the disease locations were found and marked for each disease.
도 5는 실제 데이터와 종래의 심층 합성곱 적대적 생성 신경망의 결과, 본원의 멀티 스케일 조건부 적대적 생성 신경망의 결과를 나타낸 것이다. 해당 질환에 대한 구체적인 구별방법은 다음과 같다.FIG. 5 shows the results of real data and the conventional deep convolutional adversarial neural network, and the multi-scale conditional adversarial neural network of the present application. The specific method for distinguishing the disease is as follows.
무기폐(Atelectasis)는 폐 전체 혹은 일부의 공기 감소를 일으키며, 일반 적으로 폐 용적 감소를 동반하고 호흡길(기도)이 치우치거나 폐의 백색화가 일어난다. 심장 비대(Cardiomegaly)는 심실벽(근육)이 두꺼워져 심근의 무게가 증가한 상태로 흉곽음영의 내부 길이에 비하여 심장음영의 길이가 절반이상 차지한 경우로 정의한다. 흉수(Effusion)는 흉막강 내 이상으로 고인 액체로 좌우 흉강의 불균형한 음영증가가 보인다. 침윤(Infiltration)은 정상 조직에 염증세포가 모여 있는 모양으로 폐포성 음영증가를 보이며 폐 주변부위에 잘 나타난다. 종괴(Mass)는 직경 3cm이상의 큰 덩어리 모양으로 증가된 음영. 폐, 흉막, 종격(동), 흉벽 등 흉부 모든 곳에서 기술 가능하다. 결절(Nodule)은 3cm이하의 경계가 그려지는 둥근 폐 음영이다. 폐렴(Pneumonia)은 폐에 염증이 일어나는 반응으로 주로 세균의 감염을 통해 일어나고 영상 내 그물모양이나 벌집 모양 같은 음영이 발견된다. 마지막으로, 기흉(Pneumothorax)은 흉막강 내에 구멍이 생겨 공기가 고이는 상태로 큰 공기주머니가 보인다.Atelectasis causes air loss of all or part of the lungs, usually accompanied by a decrease in lung volume, biased respiratory tract (airway), or whitening of the lungs. Cardiomegaly is defined as a condition in which the ventricular wall (muscle) thickens and the weight of the myocardium increases, and the length of the cardiac shadow occupies more than half of the internal length of the thoracic shadow. Effusion is a stagnant fluid in the pleural cavity, which shows an unbalanced increase in shading in the left and right thoracic cavities. Infiltration is a form of inflammatory cells gathered in normal tissue, and it shows an increase in alveolar shade and is well seen around the lungs. A mass is a large mass with a diameter of 3 cm or more and increased shading. It can be described anywhere in the chest, including the lungs, pleura, mediastinum (sinus), and chest wall. A nodule is a round lung shade with a border of 3 cm or less. Pneumonia is an inflammatory reaction in the lungs, mainly caused by bacterial infection, and shadows such as mesh or honeycomb are found in the image. Finally, in pneumothorax, a hole in the pleural cavity is formed and a large air sac is visible.
위 질환 설명을 기반으로 종래의 심층 합성곱 적대적 생성 신경망의 결과들은 실제 데이터와 비교해 영상의 해상도가 확연히 떨어짐을 확인할 수 있다. 즉, 늑골의 모양이나 폐 기관지 모습이 선명하게 보이지 않아 주요 특징들이 제대로 학습되었다고 보기 어렵다. 반면, 본원의 멀티 스케일 조건부 적대적 생성 신경망의 결과는 실제 영상과 구별되지 않을 정도의 영상특징과 해상도를 보였으며, 각 질환별 특징도 실제 영상과 큰 차이가 없을 정도로 잘 학습하였다. 한 예로 심장 비대의 경우 다른 질환과 비교하였을 때 심장의 모습이 확연히 커진 것을 볼 수 있는데, 이는 '심장 크기 증가'라는 해당 질환의 특징을 잘 학습한 결과이다.Based on the above disease description, it can be confirmed that the results of the conventional deep convolutional adversarial neural network have significantly lowered image resolution compared to the actual data. That is, the shape of the ribs or the shape of the lung bronchus is not clearly visible, so it is difficult to say that the main features have been properly learned. On the other hand, the results of our multi-scale conditional adversarial neural network showed image features and resolution that were indistinguishable from the actual images, and the characteristics of each disease were also learned so well that there was no significant difference from the actual images. For example, in the case of cardiac hypertrophy, compared to other diseases, the shape of the heart is significantly increased, which is a result of well learning the characteristic of the disease, 'increase in heart size'.
도 6은 본 발명의 일 실시예에 따른 흉부 X선 영상 생성 방법을 나타낸 흐름도이다. 도 6에 도시된 일 실시예에 따른 흉부 X선 영상 생성 방법은 도 1에 도시된 흉부 X선 영상 생성 장치(1)에서 시계열적으로 처리되는 단계들을 포함한다. 따라서, 이하 생략된 내용이라고 하더라도 도 6에 도시된 일 실시예에 따라 수행되는 흉부 X선 영상 생성 방법에도 적용된다. FIG. 6 is a flowchart illustrating a method for generating a chest X-ray image according to an embodiment of the present invention. The method according to the embodiment shown in FIG. 6 includes steps processed in time series by the chest X-ray image generating apparatus 1 shown in FIG. 1. Therefore, even where omitted below, the foregoing description also applies to the chest X-ray image generating method performed according to the embodiment shown in FIG. 6.
도 6을 참조하면, 단계 S600에서 생성자 및 판별자를 포함하는 멀티 스케일 조건부 적대적 생성 신경망을 생성할 수 있다. 여기서, 생성자는 잠재변수 및 특정 질환의 특징에 대한 조건변수를 입력받고 각 분기마다 업샘플링(up-sampling)하여 각각 해상도가 다른 특정 질환을 나타내는 복수의 흉부 X선 영상을 생성할 수 있다. 또한, 판별자는 복수의 흉부 X선 영상의 진위 여부를 구별할 수 있다.Referring to FIG. 6 , in step S600 , a multi-scale conditional adversarial generation neural network including a generator and a discriminator may be generated. Here, the generator may generate a plurality of chest X-ray images representing specific diseases having different resolutions by receiving the latent variable and the condition variable for the characteristic of a specific disease and performing up-sampling for each branch. In addition, the discriminator may discriminate whether the plurality of chest X-ray images are authentic or not.
단계 S610에서 멀티 스케일 조건부 적대적 생성 신경망을 통해 흉부 X선 영상을 생성할 수 있다. 예를 들어, 멀티 스케일 조건부 적대적 생성 신경망에 잠재변수 및 특정 질환의 특징에 대한 조건변수를 입력하여 기설정된 값 이상의 해상도를 갖고 특정 질환을 나타내는 흉부 X선 영상을 생성할 수 있다.In operation S610, a chest X-ray image may be generated through the multi-scale conditional adversarial generation neural network. For example, by inputting a latent variable and a condition variable for a characteristic of a specific disease into a multi-scale conditional adversarial generating neural network, a chest X-ray image representing a specific disease with a resolution greater than or equal to a preset value may be generated.
도 6을 통해 설명된 흉부 X선 영상 생성 방법은 매체에 저장된 컴퓨터 프로그램의 형태로 구현되거나, 컴퓨터에 의해 실행되는 프로그램 모듈과 같은 컴퓨터에 의해 실행 가능한 명령어를 포함하는 기록 매체의 형태로도 구현될 수 있다. 컴퓨터 판독 가능 매체는 컴퓨터에 의해 액세스될 수 있는 임의의 가용 매체일 수 있고, 휘발성 및 비휘발성 매체, 분리형 및 비분리형 매체를 모두 포함한다. 또한, 컴퓨터 판독가능 매체는 컴퓨터 저장 매체를 포함할 수 있다. 컴퓨터 저장 매체는 컴퓨터 판독가능 명령어, 데이터 구조, 프로그램 모듈 또는 기타 데이터와 같은 정보의 저장을 위한 임의의 방법 또는 기술로 구현된 휘발성 및 비휘발성, 분리형 및 비분리형 매체를 모두 포함한다. The chest X-ray image generating method described with reference to FIG. 6 may be implemented in the form of a computer program stored in a medium, or in the form of a recording medium including instructions executable by a computer, such as a program module executed by a computer. Computer-readable media can be any available media that can be accessed by a computer and include both volatile and nonvolatile media, removable and non-removable media. Computer-readable media may also include computer storage media, which include both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for the storage of information such as computer-readable instructions, data structures, program modules, or other data.
전술한 본 발명의 설명은 예시를 위한 것이며, 본 발명이 속하는 기술분야의 통상의 지식을 가진 자는 본 발명의 기술적 사상이나 필수적인 특징을 변경하지 않고서 다른 구체적인 형태로 쉽게 변형이 가능하다는 것을 이해할 수 있을 것이다. 그러므로 이상에서 기술한 실시예들은 모든 면에서 예시적인 것이며 한정적이 아닌 것으로 이해해야만 한다. 예를 들어, 단일형으로 설명되어 있는 각 구성 요소는 분산되어 실시될 수도 있으며, 마찬가지로 분산된 것으로 설명되어 있는 구성 요소들도 결합된 형태로 실시될 수 있다. The foregoing description of the present invention is for illustration, and those of ordinary skill in the art to which the present invention pertains will understand that it can easily be modified into other specific forms without changing the technical spirit or essential features of the present invention. Therefore, the embodiments described above should be understood as illustrative in all respects and not restrictive. For example, each component described as singular may be implemented in a distributed manner, and likewise components described as distributed may be implemented in a combined form.
본 발명의 범위는 상기 상세한 설명보다는 후술하는 특허청구범위에 의하여 나타내어지며, 특허청구범위의 의미 및 범위 그리고 그 균등 개념으로부터 도출되는 모든 변경 또는 변형된 형태가 본 발명의 범위에 포함되는 것으로 해석되어야 한다. The scope of the present invention is indicated by the following claims rather than the detailed description above, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present invention.
Claims (18)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| KR20210021983 | 2021-02-18 | ||
| KR10-2021-0021983 | 2021-02-18 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2022177044A1 true WO2022177044A1 (en) | 2022-08-25 |
Family
ID=82932002
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/KR2021/002628 Ceased WO2022177044A1 (en) | 2021-02-18 | 2021-03-03 | Apparatus and method for generating high-resolution chest x-ray image by using attention-mechanism-based multi-scale conditional generative adversarial neural network |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2022177044A1 (en) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN115423734A (en) * | 2022-11-02 | 2022-12-02 | 国网浙江省电力有限公司金华供电公司 | Infrared and visible light image fusion method based on multi-scale attention mechanism |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| KR20180117009A (en) * | 2017-04-18 | 2018-10-26 | 황성관 | Artificial Intelligence Method for Augumenting Ultrasound Image |
| KR20190143708A (en) * | 2018-06-21 | 2019-12-31 | 한국외국어대학교 연구산학협력단 | Apparatus and Method for Bone Suppression from X-ray Image based on Deep Learning |
| KR20200058295A (en) * | 2018-11-19 | 2020-05-27 | 고려대학교 산학협력단 | Method and Device of High Magnetic Field Magnetic Resonance Image Synthesis |
| KR20200101540A (en) * | 2019-02-01 | 2020-08-28 | 장현재 | Smart skin disease discrimination platform system constituting API engine for discrimination of skin disease using artificial intelligence deep run based on skin image |
Non-Patent Citations (1)
| Title |
|---|
| ANN KYEONGJIN, YEONGGUL JANG, SEONGMIN HA, BYUNGHWAN JEON, YOUNGTAEK HONG, HACKJOON SHIM, HYUK-JAE CHANG: "Generation of High-Resolution Chest X-rays using Multi-scale Conditional Generative Adversarial Network with Attention", JOURNAL OF BROADCAST ENGINEERING, KOREA, vol. 25, no. 1, 30 January 2020 (2020-01-30), Korea , XP055964187, ISSN: 1226-7953, DOI: 10.5909/JBE.2020.25.1.1 * |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Uparkar et al. | Vision transformer outperforms deep convolutional neural network-based model in classifying X-ray images | |
| WO2021073380A1 (en) | Method for training image recognition model, and method and apparatus for image recognition | |
| WO2017051943A1 (en) | Method and apparatus for generating image, and image analysis method | |
| WO2019143177A1 (en) | Method for reconstructing series of slice images and apparatus using same | |
| CN116071401B (en) | Virtual CT image generation method and device based on deep learning | |
| WO2022010075A1 (en) | Method for analyzing human tissue on basis of medical image and device thereof | |
| WO2019172498A1 (en) | Computer-aided diagnosis system for providing tumor malignancy and basis of malignancy inference and method therefor | |
| WO2021034138A1 (en) | Dementia evaluation method and apparatus using same | |
| WO2023085910A1 (en) | Image learning method, device, program, and recording medium using generative adversarial network | |
| WO2021137454A1 (en) | Artificial intelligence-based method and system for analyzing user medical information | |
| CN115330615A (en) | Artifact removal model training method, apparatus, equipment, medium and program product | |
| WO2019143021A1 (en) | Method for supporting viewing of images and apparatus using same | |
| WO2025084546A1 (en) | Method and system for quantitatively analyzing brain image based on ct image | |
| CN111461065A (en) | Tubular structure identification method and device, computer equipment and readable storage medium | |
| KR20220149929A (en) | Apparatus and method for generating high-resolution chest x-ray using conditional generative advrsarial network with attention mechanism | |
| WO2022177044A1 (en) | Apparatus and method for generating high-resolution chest x-ray image by using attention-mechanism-based multi-scale conditional generative adversarial neural network | |
| WO2023047963A1 (en) | Medical image diagnostics assistance device, medical image diagnostics assistance method, and program | |
| WO2023132392A1 (en) | Method and system for analyzing blood flow characteristics in carotid artery by means of particle-based simulation | |
| CN113850796A (en) | Lung disease identification method and device, medium and electronic equipment based on CT data | |
| WO2024143757A1 (en) | Electronic device for detecting abnormality of extraocular muscle in orbital ct image, method therefor, and method of training 3d variational autoencoder model for detecting abnormality of extraocular muscle in orbital ct image | |
| CN116978549A (en) | Organ disease prediction method, device, equipment and storage medium | |
| WO2023113500A1 (en) | System and method for predicting depth image for medical image | |
| WO2022025477A1 (en) | Method for training artificial neural network for predicting treatment response to disease, and treatment response prediction apparatus | |
| Raner et al. | Progressive Growing of Generative Adversarial Networks (PGGAN) Approach to Synthesize Medical Images | |
| WO2018159868A1 (en) | Medical image region segmentation method and device therefor |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 21926844 Country of ref document: EP Kind code of ref document: A1 |
|
| NENP | Non-entry into the national phase |
Ref country code: DE |
|
| 122 | Ep: pct application non-entry in european phase |
Ref document number: 21926844 Country of ref document: EP Kind code of ref document: A1 |