
WO2019132590A1 - Procédé et dispositif de transformation d'image - Google Patents

Procédé et dispositif de transformation d'image Download PDF

Info

Publication number
WO2019132590A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
input
feature
classification
present disclosure
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/KR2018/016863
Other languages
English (en)
Korean (ko)
Inventor
김원태
강신욱
이명재
김동민
김필수
김종열
노병필
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jlk Inspection Inc
Republic Of Korea (korea Customs Service)
Original Assignee
Jlk Inspection Inc
Republic Of Korea (korea Customs Service)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jlk Inspection Inc, Republic Of Korea (korea Customs Service) filed Critical Jlk Inspection Inc
Publication of WO2019132590A1 publication Critical patent/WO2019132590A1/fr
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/002D [Two Dimensional] image generation
    • G06T11/003Reconstruction from projections, e.g. tomography
    • G06T11/005Specific pre-processing for tomographic reconstruction, e.g. calibration, source positioning, rebinning, scatter correction, retrospective gating
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10116X-ray image
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Definitions

  • the present disclosure relates to an image conversion method and apparatus. More particularly, this disclosure relates to an image transformation method and apparatus based on image feature classification.
  • X-ray images are used in customs clearance systems and enterprise information security systems because they can reveal concealed items that cannot be identified or detected in images captured by an ordinary imaging device such as a CCD camera.
  • Security screening with X-ray images is performed by an inspector examining the X-ray images one by one and judging whether an inappropriate or dangerous object is present. The result of such screening therefore varies greatly with the skill of the examiner who reads the X-ray images.
  • the technical object of the present disclosure is to provide an image conversion method and apparatus.
  • According to an aspect of the present disclosure, an image conversion method includes an image input step of receiving an image, an object classification step of classifying objects included in the input image, a reference image selection step of selecting a reference image based on the classification result of the object classification step, and an image transformation step of transforming at least one of the input image and an image related to the object using the selected reference image to generate a transformed image.
  • According to another aspect of the present disclosure, an image conversion apparatus includes an image input unit for receiving an image, an object classification unit for classifying objects included in the input image, a reference image selection unit for selecting a reference image based on the classification result of the object classification unit, and an image conversion unit for transforming at least one of the input image and an image related to the object using the selected reference image to generate a transformed image.
  • an image conversion method and apparatus can be provided.
  • a method and apparatus for transforming an input image to appear visually similar to an X-ray image can be provided.
  • FIG. 1 is a view for explaining a structure of an image conversion apparatus according to an embodiment of the present disclosure.
  • FIG. 2 is a diagram for explaining the operation of the image conversion apparatus according to an embodiment of the present disclosure.
  • FIG. 3 is a diagram for explaining an image conversion method according to an embodiment of the present disclosure.
  • FIG. 4 is a diagram for explaining context analysis according to an embodiment of the present disclosure.
  • FIG. 5 is a diagram illustrating a process of generating and analyzing context information of an image according to an embodiment of the present disclosure.
  • FIG. 6 is a diagram for explaining a process of analyzing an image and identifying an object according to an image analysis apparatus according to an embodiment of the present disclosure.
  • FIG. 7 is a view for explaining the operation of the image analysis apparatus according to an embodiment of the present disclosure.
  • FIG. 8 is a diagram for explaining an embodiment of a convolutional neural network for generating a multi-channel feature map.
  • FIG. 9 is a diagram for explaining an embodiment of the pooling technique.
  • first, second, etc. are used only for the purpose of distinguishing one element from another, and do not limit the order or importance of elements, etc. unless specifically stated otherwise.
  • a first component in one embodiment may be referred to as a second component in another embodiment, and similarly a second component in one embodiment may be referred to as a first component .
  • the components that are distinguished from each other are intended to clearly illustrate each feature and do not necessarily mean that components are separate. That is, a plurality of components may be integrated into one hardware or software unit, or a single component may be distributed into a plurality of hardware or software units. Thus, unless otherwise noted, such integrated or distributed embodiments are also included within the scope of this disclosure.
  • the components described in the various embodiments are not necessarily essential components, and some may be optional components. Thus, embodiments consisting of a subset of the components described in one embodiment are also included within the scope of the present disclosure. Also, embodiments that include other elements in addition to the elements described in the various embodiments are also included in the scope of the present disclosure.
  • FIG. 1 is a view for explaining a structure of an image conversion apparatus according to an embodiment of the present disclosure.
  • the image conversion apparatus 100 may receive the image 110 and output the converted image 160.
  • The image transforming apparatus 100 may include a data preprocessing unit 120, an object classifying unit 130, a reference image selecting unit 140, and/or an image transforming unit 150. Some of the components included in the image conversion apparatus 100 may be omitted, or their functions may be integrated into other components. Conversely, two or more operations described as being performed by one component may be performed by separate components.
  • the data preprocessing unit 120 may perform pre-processing to improve the quality of the input image 110.
  • An image filtering technique such as Fast Fourier Transform (FFT) processing, histogram equalization, motion artifact removal, and/or noise removal can be applied for the pre-processing.
  • a median filter may be applied to remove small-sized noise that may occur during image capture.
  • the quality of the image 110 may be improved by, for example, performing FFT to emphasize a component corresponding to a specific frequency region.
  • For example, edge components included in the image 110 can be emphasized by performing the FFT and assigning a higher weight to the high-frequency components, as in the sketch below.
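  • The following is a minimal, illustrative sketch of such a pre-processing pipeline (median filtering, histogram equalization, and FFT-based high-frequency emphasis). It is not the patented implementation; the function name, the filter size, and the gain and radius parameters are assumptions made for illustration.

```python
# Illustrative sketch only: one possible pre-processing pipeline of the kind
# described above. Parameter values are hypothetical.
import numpy as np
from scipy.ndimage import median_filter

def preprocess(image: np.ndarray, hf_gain: float = 1.5, radius: int = 20) -> np.ndarray:
    """image: 2-D grayscale array with values in [0, 255]."""
    # 1) Median filter to suppress small-scale noise from image capture.
    img = median_filter(image.astype(np.float64), size=3)

    # 2) Simple histogram equalization to spread out intensity values.
    hist, bins = np.histogram(img.ravel(), bins=256, range=(0, 255))
    cdf = hist.cumsum() / hist.sum()
    img = np.interp(img.ravel(), bins[:-1], cdf * 255).reshape(img.shape)

    # 3) FFT-based emphasis: boost frequency components outside a small
    #    low-frequency disc so that edges (high frequencies) stand out.
    f = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    yy, xx = np.ogrid[:h, :w]
    mask = (yy - h // 2) ** 2 + (xx - w // 2) ** 2 > radius ** 2
    f[mask] *= hf_gain
    out = np.real(np.fft.ifft2(np.fft.ifftshift(f)))
    return np.clip(out, 0, 255)
```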
  • The data preprocessing unit 120 may separate the objects included in the image 110 into individual single objects by applying an object region segmentation technique.
  • the object region segmentation technique may be a segmentation technique using an image characteristic of an object, and may include an artificial intelligence based segmentation technique through neural network structure learning.
  • A single object image may be generated for each object included in the image 110, for example as sketched below.
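  • As a purely illustrative sketch, region segmentation based on an image characteristic could look like the following, which crops one sub-image per connected foreground region. The intensity threshold and the assumption that objects are darker than the background are hypothetical; the disclosure also allows neural-network-based segmentation.

```python
# Illustrative sketch only: simple connected-component segmentation into
# single-object crops. Threshold and polarity are assumptions.
import numpy as np
from scipy.ndimage import label, find_objects

def split_into_single_objects(image: np.ndarray, threshold: float = 128.0):
    """Return a list of single-object crops from a 2-D grayscale image."""
    foreground = image < threshold           # assume objects are darker than background
    labels, num_objects = label(foreground)  # connected-component labelling
    crops = []
    for region_slice in find_objects(labels):
        if region_slice is not None:
            crops.append(image[region_slice])
    return crops
```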
  • the data preprocessing unit 120 may be omitted. That is, the image conversion apparatus 100 may include an object classification unit 130, a reference image selection unit 140, and an image conversion unit 150.
  • The object classification unit 130 may receive a single object image and classify the type of the object it contains. Morphological features of the object can be used for the classification, and a machine learning based classification technique may be applied. For example, objects can be classified using at least one of a Support Vector Machine (SVM), AdaBoost, and a Convolutional Neural Network (CNN).
  • SVM is a machine learning method: a supervised learning model for pattern recognition and data analysis.
  • SVM can be used mainly for classification and regression analysis.
  • Given a set of data points, each belonging to one of two categories, the SVM algorithm can build a non-probabilistic binary linear classification model that determines which category a new data point belongs to.
  • The generated classification model can be represented as a boundary in the space to which the data are mapped, and the SVM algorithm finds the boundary with the largest margin.
  • AdaBoost is one of the machine learning meta algorithms and can be used in combination with other types of learning algorithms to improve performance.
  • The final result of the classifier is expressed as a weighted combination of the outputs of other learning algorithms (weak learners); even if the performance of each individual learner is poor, as long as each performs slightly better than random guessing, the combined model can converge to a strong learner. A minimal sketch of such machine-learning-based classification is shown below.
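  • The following sketch classifies objects from hand-crafted morphological feature vectors with an SVM or AdaBoost, as the disclosure mentions. The feature names, values, and labels are hypothetical placeholders, not data from the disclosure.

```python
# Illustrative sketch only: machine-learning-based classification of
# morphological features. All data below are hypothetical.
import numpy as np
from sklearn.svm import SVC
from sklearn.ensemble import AdaBoostClassifier

# Each row is a feature vector of one single-object image,
# e.g. [area, aspect_ratio, circularity, mean_intensity].
X_train = np.array([[1200, 2.8, 0.30, 90],    # toothpaste-tube-like
                    [800,  1.1, 0.85, 140],   # vitamin-bottle-like
                    [1300, 3.0, 0.28, 85],
                    [750,  1.0, 0.80, 150]])
y_train = np.array(["toothpaste", "vitamin", "toothpaste", "vitamin"])

svm_clf = SVC(kernel="linear").fit(X_train, y_train)
ada_clf = AdaBoostClassifier(n_estimators=50).fit(X_train, y_train)

x_new = np.array([[820, 1.2, 0.82, 138]])
print(svm_clf.predict(x_new))   # e.g. ['vitamin']
print(ada_clf.predict(x_new))   # e.g. ['vitamin']
```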
  • A convolutional neural network structure may be used to extract "features" such as borders and line colors from the input data (image) and may include multiple layers. Each layer can receive input data and process it to generate output data.
  • The convolutional neural network can output, as output data, feature maps generated by convolving the input image or an input feature map with filter kernels.
  • The initial layers of the convolutional neural network may extract low-level features such as edges or gradients from the input.
  • The subsequent layers of the neural network can extract progressively more complex features such as eyes, a nose, and so on.
  • The convolutional neural network may also include a pooling layer, in which a pooling operation is performed, in addition to the convolutional layers in which convolution operations are performed.
  • the pooling technique is a technique used to reduce the spatial size of data in the pooling layer.
  • the pooling technique includes a max pooling technique for selecting a maximum value in a corresponding area and an average pooling technique for selecting an average value of the corresponding area.
  • In the field of image recognition, the max pooling technique is generally used.
  • the window size and interval (stride) of the pooling are generally set to the same value.
  • The stride is the interval by which the filter moves when it is applied to the input data; the stride can also be used to adjust the size of the output data.
  • A convolutional neural network structure can be trained to classify a single object image. The training can be performed so that the morphological characteristics of the objects included in the images are learned more clearly, for example as in the sketch below.
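  • A minimal sketch of such a convolutional classifier for single-object images follows; the layer sizes, input resolution, and number of classes are assumptions, not values from the disclosure.

```python
# Illustrative sketch only: a small convolutional classifier for
# single-object images. Architecture details are hypothetical.
import torch
import torch.nn as nn

num_classes = 10  # assumed number of object categories

model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),  # low-level features (edges)
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),       # pooling window == stride
    nn.Conv2d(16, 32, kernel_size=3, padding=1), # more complex features
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),
    nn.Flatten(),
    nn.Linear(32 * 16 * 16, num_classes),        # assumes 64x64 grayscale input
)

x = torch.randn(8, 1, 64, 64)          # batch of single-object images
logits = model(x)                      # shape: (8, num_classes)
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, num_classes, (8,)))
loss.backward()                        # gradients for one training step
```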
  • the reference image selection unit 140 may select one reference image from one or more reference images stored in a reference image storage unit (not shown) based on the classification result of the object classification unit 130.
  • the reference image may be an X-ray image.
  • the reference image may be a raw image taken by an X-ray imaging device or an image in any format for storing or transmitting the image.
  • The reference image may also be obtained by capturing the image data that an X-ray reading device transmits to an output device such as a monitor.
  • the reference images stored in the reference image storage unit may be categorized or stored as a database.
  • The reference image selector 140 can select a reference image associated with the object by matching the classification result of the object against the categories of the stored reference images.
  • the image converting unit 150 may generate a transformed image using the selected reference image and the single object image.
  • the morphological feature of the object can be incrementally updated to follow the X-ray reference image.
  • an optimization function can be designed.
  • The function can be designed as a weighted sum of a term expressing the shape of the object in the single object image and a term expressing the characteristics of the X-ray reference image.
  • The shape term can be designed as a loss that minimizes the difference between the responses of the convolutional network filters to the single object image and to the final transformed image.
  • This loss-minimizing term can be regarded as a term designed to reconstruct the content of the image as faithfully as possible.
  • The reference-image term can be designed as a loss based on the correlations between the filter responses of each layer of the convolutional neural network.
  • the term that expresses the features of the X-ray reference image thus designed can be regarded as a term that can express the overall characteristics of the X-ray image rather than the detailed features of the image.
  • The final loss function can be designed as a weighted sum of the two losses, and the image that minimizes it can be obtained as the final transformed image, for example as in the sketch below.
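  • The following sketch illustrates one way such a weighted-sum loss could be written, with a content term on filter responses and a reference term on their correlations (Gram matrices). The feature extractor, the layer choices, and the weights alpha and beta are assumptions; this is not the patented implementation.

```python
# Illustrative sketch only: weighted sum of a content loss and a
# correlation (Gram) loss over convolutional feature maps.
import torch
import torch.nn.functional as F

def gram_matrix(feat: torch.Tensor) -> torch.Tensor:
    """feat: (C, H, W) feature map -> (C, C) correlation of filter responses."""
    c, h, w = feat.shape
    f = feat.reshape(c, h * w)
    return (f @ f.t()) / (c * h * w)

def transform_loss(feats_generated, feats_object, feats_reference,
                   alpha: float = 1.0, beta: float = 1e3) -> torch.Tensor:
    """Each argument is a list of (C, H, W) feature maps, one per layer."""
    content = sum(F.mse_loss(g, o) for g, o in zip(feats_generated, feats_object))
    style = sum(F.mse_loss(gram_matrix(g), gram_matrix(r))
                for g, r in zip(feats_generated, feats_reference))
    return alpha * content + beta * style

# The transformed image would then be obtained by initializing it (e.g. from
# the single object image) and iteratively updating its pixels to minimize
# this loss with a gradient-based optimizer.
```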
  • FIG. 2 is a diagram for explaining the operation of the image conversion apparatus according to an embodiment of the present disclosure.
  • The input image 210 may include, for example, a vitamin bottle and a toothpaste tube as objects.
  • the data preprocessing unit 120 of the image conversion apparatus of the present disclosure can generate a single object image 220, 260 by performing data preprocessing on the input image 210.
  • the object classification unit 130 may classify the objects included in the single object images 220 and 260 and output classification results such as "Vitamin” 230 and "Toothpaste" 270 for each object .
  • the reference image selector 140 can select the reference images 240 and 280 from the reference image storage unit based on the classification results 230 and 270 of the object classification unit 130.
  • the reference images 240 and 280 may be X-ray images.
  • the image converting unit 150 may generate the converted images 250 and 290 by performing image conversion using the single object images 220 and 260 and the reference images 240 and 280.
  • the transformed images 250 and 290 may be images obtained by transforming the single object images 220 and 260 so as to be visually similar to the X-ray images.
  • FIG. 3 is a diagram for explaining an image conversion method according to an embodiment of the present disclosure.
  • In step S310, an image to be subjected to image conversion can be input.
  • the input image may undergo a data preprocessing process in step S320.
  • preprocessing can be performed to improve the quality of the image.
  • The objects included in the input image can be separated into single objects.
  • In step S320, a single object image including a single object may be generated. If the input image of step S310 is already a single object image, the data preprocessing of step S320 may be skipped.
  • In step S330, the types of the objects included in the single object images can be classified. For the classification, a machine learning based technique using the morphological characteristics of the objects can be applied.
  • the reference image can be selected based on the classification result of step S330.
  • a transformed image can be generated by transforming a single object image using the selected reference image.
  • A deep learning-based model may include, but is not limited to, at least one of a fully convolutional neural network, a convolutional neural network, a recurrent neural network, a restricted Boltzmann machine (RBM), and a deep belief network (DBN).
  • A machine learning method other than deep learning may also be included.
  • A hybrid model combining deep learning and machine learning may also be used: for example, features of an image may be extracted by applying a deep learning-based model, and a machine learning-based model may be applied when the image is classified or recognized based on the extracted features.
  • The machine learning based model may include, but is not limited to, a support vector machine (SVM), AdaBoost, and the like; a minimal sketch of such a hybrid is shown below.
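  • In the sketch below, a (here untrained, for brevity) convolutional feature extractor feeds an SVM classifier; the network shape, feature dimension, and placeholder data are assumptions made for illustration.

```python
# Illustrative sketch only: deep-learning feature extraction followed by a
# machine-learning classifier. All data and shapes are hypothetical.
import torch
import torch.nn as nn
import numpy as np
from sklearn.svm import SVC

extractor = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),  # convolutional features
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),                    # pool to one value per channel
    nn.Flatten(),                               # -> (N, 8) feature vectors
)

def deep_features(images: torch.Tensor) -> np.ndarray:
    with torch.no_grad():
        return extractor(images).numpy()

train_images = torch.randn(20, 1, 64, 64)   # placeholder single-object images
train_labels = np.array([0, 1] * 10)        # placeholder class labels

clf = SVC().fit(deep_features(train_images), train_labels)
print(clf.predict(deep_features(torch.randn(1, 1, 64, 64))))
```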
  • the object classifier 130 may perform context analysis to classify object types.
  • One embodiment of a context analysis method performed by the object classifier 130 will be described with reference to FIGS. 4 to 9.
  • the image analysis apparatus 400 of FIG. 4 may be an embodiment of the object classification unit 130 of FIG.
  • the image analyzing apparatus 400 of FIG. 4 may be included in the object classifying unit 130 of FIG. 1, or may be a device separately configured to perform context analysis.
  • the image analysis apparatus 400 may include a feature extraction unit 410, a context generation unit 420, and / or a feature and context analysis unit 430. It should be noted, however, that only the components necessary for explaining the present embodiment are shown, and the components included in the image analysis apparatus 400 are not limited to the above-described examples.
  • The image analysis apparatus 400 extracts features of the input image (the analysis target image), generates context information based on the extracted features, and analyzes the analysis target image based on the extracted features and the generated context information. For example, the image analysis apparatus 400 can classify the image or locate an object of interest using the extracted features and the generated context information.
  • the input image of the image analysis apparatus 400 may be the same as the input image of the object classification unit 130 of FIG.
  • the feature extracting unit 410 may extract an image feature by analyzing the input image.
  • the feature may be a local feature for each region of the image.
  • the feature extraction unit 410 may extract characteristics of an input image using a general convolutional neural network (CNN) technique or a pooling technique.
  • the pooling scheme may include at least one of a max pooling scheme and an average pooling scheme.
  • The pooling technique referred to in the present disclosure is not limited to max pooling or average pooling; it includes any technique for obtaining a representative value of an image area of a predetermined size.
  • the representative value used in the pooling technique may be at least one of a variance value, a standard deviation value, a mean value, a most frequent value, a minimum value, a weighted average value, etc., in addition to the maximum value and the average value.
  • The convolutional neural network of the present disclosure may be used to extract "features" such as borders and line colors from input data (images) and may include multiple layers. Each layer can receive input data and process it to generate output data. The convolutional neural network can output, as output data, feature maps generated by convolving the input image or an input feature map with filter kernels. The initial layers of the convolutional neural network may extract low-level features such as edges or gradients from the input, and the subsequent layers can extract progressively more complex features such as eyes, a nose, and so on. The concrete operation of the convolutional neural network will be described later with reference to FIG. 8.
  • The convolutional neural network may also include a pooling layer, in which a pooling operation is performed, in addition to the convolutional layers in which convolution operations are performed.
  • the pooling technique is a technique used to reduce the spatial size of data in the pooling layer.
  • the pooling technique includes a max pooling technique for selecting a maximum value in a corresponding area and an average pooling technique for selecting an average value of the corresponding area.
  • In the field of image recognition, the max pooling technique is generally used.
  • the window size and interval (stride) of the pooling are generally set to the same value.
  • The stride is the interval by which the filter moves when it is applied to the input data; the stride can also be used to adjust the size of the output data.
  • The specific operation of the pooling technique will be described later with reference to FIG. 9.
  • As pre-processing for extracting features of the analysis target image, the feature extraction unit 410 can apply filtering to the analysis target image.
  • The filtering may be a Fast Fourier Transform (FFT), histogram equalization, motion artifact removal, or noise removal.
  • the filtering of the present disclosure is not limited to the methods listed above, and may include any type of filtering that can improve the quality of the image.
  • image enhancement may be performed as a pre-processing.
  • the context generation unit 420 may generate context information of the input image (analysis target image) using the characteristics of the input image extracted from the feature extraction unit 410.
  • the context information may be a representative value indicating all or a part of the region to be analyzed.
  • the context information may be global context information of the input image.
  • The context generation unit 420 according to an embodiment can generate context information by applying a convolutional neural network technique or a pooling technique to the features extracted by the feature extraction unit 410.
  • the pooling technique may be, for example, an average pooling technique.
  • the feature and context analyzer 430 may analyze the image based on the features extracted by the feature extractor 410 and the context information generated by the context generator 420.
  • The feature and context analyzing unit 430 according to an embodiment concatenates the local features of each region of the image, extracted by the feature extracting unit 410, with the global context reconstructed by the context generating unit 420, and uses the result to classify the input image or to find the position of an object of interest included in the input image. Since the information at a specific two-dimensional position in the input image then includes not only local feature information but also global context information, the feature and context analyzing unit 430 can use both kinds of information to more accurately recognize or classify input images whose local features are similar but whose actual contents differ.
  • The invention according to one embodiment of the present disclosure therefore allows more accurate and efficient learning and image analysis by using global context information in addition to the local features used by a conventional artificial neural network technique.
  • A neural network to which the invention according to the present disclosure is applied may be referred to as a 'deep neural network with context analysis'. A minimal sketch of the feature-and-context combination is shown below.
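  • The following sketch illustrates one way to combine per-position local features with a pooled global context vector, as described above; the tensor shapes and the 1x1 classification head are assumptions, not the patented implementation.

```python
# Illustrative sketch only: concatenating local features with a broadcast
# global context vector. Shapes and the head are hypothetical.
import torch
import torch.nn.functional as F

def analyze_with_context(local_feats: torch.Tensor, num_classes: int = 5):
    """local_feats: (N, C, H, W) feature maps from a feature extractor."""
    n, c, h, w = local_feats.shape

    # Global context: average-pool the whole map to one vector per channel...
    global_ctx = F.adaptive_avg_pool2d(local_feats, output_size=1)   # (N, C, 1, 1)
    # ...and broadcast it back to every spatial position.
    global_ctx = global_ctx.expand(-1, -1, h, w)                     # (N, C, H, W)

    # Concatenate local features and global context along the channel axis,
    # so each 2-D position carries both kinds of information.
    combined = torch.cat([local_feats, global_ctx], dim=1)           # (N, 2C, H, W)

    # A 1x1 convolution scores each position (e.g. for localization), and
    # spatial averaging gives an image-level classification score.
    head = torch.nn.Conv2d(2 * c, num_classes, kernel_size=1)
    position_scores = head(combined)                                 # (N, classes, H, W)
    image_scores = position_scores.mean(dim=(2, 3))                  # (N, classes)
    return position_scores, image_scores
```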
  • FIG. 5 is a diagram illustrating a process of generating and analyzing context information of an image according to an embodiment of the present disclosure.
  • The feature extraction unit 510, the context generation unit 520, and the feature and context analysis unit 530 shown in FIG. 5 correspond to the feature extraction unit 410, the context generation unit 420, and the feature and context analysis unit 430 of FIG. 4, respectively.
  • the feature extraction unit 510 may extract a feature from the input image 512 using the input image 512 and generate a feature image 514 including the extracted feature information.
  • the extracted feature may be a feature of the local region of the input image.
  • The input image 512 may be the input image of the image analysis apparatus or a feature map of any layer of a convolutional neural network model.
  • The feature image 514 may include a feature map and/or a feature vector obtained by applying a convolutional neural network technique and/or a pooling technique to the input image 512.
  • The context generation unit 520 may generate context information by applying a convolutional neural network technique and/or a pooling technique to the feature image 514 extracted by the feature extraction unit 510.
  • The context generation unit 520 may generate context information at various scales, such as the entire image, quadrant (four-part) regions, and nine-part regions, by varying the stride of the pooling.
  • the feature and context analyzer 530 can more accurately analyze a specific region of the analysis target image using both the feature image 514 and the context information images 522, 524, and 526.
  • For example, from the feature image 514 containing only the local features extracted by the feature extraction unit 510, it may not be possible to determine accurately whether an object is a car or a boat. That is, although the feature extraction unit 510 can recognize the shape of an object based on local features, the shape alone is sometimes insufficient to correctly identify and classify the object.
  • The context generating unit 520 therefore generates the context information 522, 524, and 526 based on the analysis target image or the feature image 514 so that objects can be identified and classified more accurately. For example, the features extracted from the whole image may be recognized or classified as "natural landscape", and the features extracted from the quadrant images may be recognized or classified as "natural scenery", "lake", and "water", which serve as the context information.
  • the feature and context analyzer 530 can identify the object having the shape of the boat or car as a "boat" by utilizing the context information.
  • In this example, context information for the entire image, context information for the quadrant image, and context information for the nine-part image are generated and utilized.
  • Context information may also be generated and utilized for image regions of sizes other than those described above, for example as in the sketch below.
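  • The sketch below illustrates producing context information at several scales (whole image, 2x2 quadrants, 3x3 regions) by adjusting the pooling output size; the exact pooling configuration used in the disclosure may differ.

```python
# Illustrative sketch only: multi-scale context via average pooling to
# progressively finer grids. Scales and shapes are hypothetical.
import torch
import torch.nn.functional as F

def multi_scale_context(feature_map: torch.Tensor, scales=(1, 2, 3)):
    """feature_map: (N, C, H, W). Returns one pooled map per scale."""
    contexts = []
    for s in scales:
        # s=1 covers the whole image, s=2 the four quadrants,
        # s=3 nine sub-regions, and so on.
        contexts.append(F.adaptive_avg_pool2d(feature_map, output_size=s))
    return contexts

feats = torch.randn(1, 64, 32, 32)
for ctx in multi_scale_context(feats):
    print(ctx.shape)   # (1, 64, 1, 1), (1, 64, 2, 2), (1, 64, 3, 3)
```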
  • The convolutional neural network technique and the pooling technique according to an embodiment of the present disclosure will be described later with reference to FIGS. 8 and 9.
  • FIG. 6 is a diagram for explaining a process of analyzing an image and identifying an object according to an image analysis apparatus according to an embodiment of the present disclosure.
  • the image analysis apparatus 600 receives the image 610 and generates information about image regions of various sizes, thereby accurately identifying and / or classifying the objects included in the image 610.
  • the input image 610 may be, for example, an X-ray image including a bag.
  • The image analyzing apparatus 600 analyzes the input image 610 as described above, extracting features of the entire image and features of partial regions of the image, and can thereby accurately identify the objects included in the image 610.
  • the feature 622 for the whole image may be, for example, a feature of the shape of the bag.
  • the features for some areas of the image may include, for example, a feature 624 for a handle, a feature 626 for a zipper, a feature 628 for a ring, and the like.
  • the image analysis apparatus 600 can accurately identify that the object included in the image 610 is a "bag” by utilizing the generated features 622, 624, 626, and 628 as context information.
  • Conversely, when the extracted features or context information do not match the normal characteristics of a "bag", the image analysis apparatus 600 may determine that the object included in the image 610 cannot be identified as a "bag".
  • In that case, an abnormality of the object can be output. For example, when an unexplained space of a certain thickness or more that is unrelated to the normal characteristics of a "bag" is detected, or when context information unrelated to the normal context information is included, this fact can be output to the reader, and the reader can then perform a close or detailed inspection.
  • FIG. 7 is a view for explaining the operation of the image analysis apparatus according to an embodiment of the present disclosure.
  • In step S700, the image analysis apparatus can extract features of the image to be analyzed.
  • The image analysis apparatus can extract the features of the input image using a general convolutional neural network technique or a pooling technique.
  • the feature of the analysis object image may be a local feature for each region of the image, and the pooling technique may include at least one of a max pooling technique and an average pooling technique.
  • In step S710, the image analysis apparatus may generate context information based on the features extracted in step S700.
  • The image analysis apparatus may generate the context information by applying a convolutional neural network technique and/or a pooling technique to the features extracted in step S700.
  • the context information may be a representative value indicating all or a part of the region of the image to be analyzed.
  • the context information may be global context information of the input image.
  • the pooling technique may be, for example, an average pooling technique.
  • In step S720, the image analysis apparatus may analyze the analysis target image based on the features extracted in step S700 and the context information generated in step S710.
  • The image analysis apparatus classifies the input image, or locates an object of interest included in the input image, by combining the local features of each region extracted in step S700 with the global context reconstructed in step S710. Because the information at a specific two-dimensional position in the input image then ranges from local information to global context, input images that differ in actual content but have similar local information can be recognized or classified more accurately, and context information that is inconsistent with the other context information can be detected.
  • FIG. 8 is a diagram for explaining an embodiment of a convolutional neural network for generating a multi-channel feature map.
  • Image processing based on a convolutional neural network can be applied to various fields.
  • For example, it can be applied to an image processing apparatus for image object recognition, an image processing apparatus for image reconstruction, an image processing apparatus for semantic segmentation, an image processing apparatus for scene recognition, and the like.
  • The input image 810 may be processed through the convolutional neural network 800 to output a feature map image.
  • the outputted feature map image can be utilized in various fields as described above.
  • The convolutional neural network 800 processes the input through a plurality of layers 820, 830, and 840, and each layer may output multi-channel feature map images 825 and 835.
  • The plurality of layers 820, 830, and 840 may extract features of the image by applying a filter of a predetermined size to the input data, sweeping it from the upper left to the lower right.
  • For example, a layer multiplies the upper-left NxM pixels of the input data by the filter weights and maps the result to a neuron at the upper left of the feature map.
  • The weights to be multiplied likewise form an NxM array.
  • the NxM may be, for example, 3x3, but is not limited thereto.
  • The plurality of layers 820, 830, and 840 then scan the input data from left to right and from top to bottom in steps of k pixels, mapping the weighted results to the neurons of the feature map.
  • Here, k denotes the stride by which the filter is moved when performing the convolution, and it can be set appropriately to adjust the size of the output data.
  • k may be one.
  • The NxM weight array is called a filter or filter kernel. That is, applying the filter in the plurality of layers 820, 830, and 840 is the process of performing a convolution operation with the filter kernel, and the extracted result is referred to as a "feature map" or "feature map image".
  • the layer on which the convolution operation is performed may be referred to as a convolution layer.
  • A multi-channel feature map refers to a set of feature maps corresponding to a plurality of channels and may correspond to, for example, a plurality of pieces of image data.
  • The multi-channel feature maps 825 and 835 may be the input of any layer and may also be the output produced by a feature-map operation such as a convolution.
  • The layers 820, 830, and 840 may also be referred to as "layers" or "convolutional layers". Each layer sequentially receives the multi-channel feature maps generated in the previous layer and generates, as output, the next multi-channel feature maps; for example, the L-th layer 840 receives the multi-channel feature maps generated in the (L-1)-th layer (not shown) and generates new multi-channel feature maps from them.
  • The feature maps 825 having K1 channels are the output of the feature map operation 820 of layer 1 applied to the input image 810, and are also the input of the feature map operation 830 of layer 2. Likewise, the feature maps 835 having K2 channels are the output of the feature map operation 830 of layer 2 applied to the input feature maps 825, and are the input of the feature map operation of layer 3 (not shown).
  • the multi-channel feature maps 825 generated in the first layer 820 include feature maps corresponding to K1 (K1 is an integer) channels.
  • the multi-channel feature maps 835 generated in the second layer 830 include feature maps corresponding to K2 (K2 is an integer) channels.
  • K1 and K2 which represent the number of channels, may correspond to the number of filter kernels used in the first layer 820 and the second layer 830, respectively. That is, the number of multi-channel feature maps generated in the Mth layer (M is an integer equal to or greater than 1 and equal to or smaller than L-1) may be equal to the number of filter kernels used in the Mth layer.
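  • The sketch below illustrates how the number of filter kernels in each layer determines the channel counts K1 and K2 of the multi-channel feature maps; the kernel counts and image size are assumptions chosen for illustration.

```python
# Illustrative sketch only: channel counts of multi-channel feature maps
# follow the number of filter kernels per layer. Values are hypothetical.
import torch
import torch.nn as nn

K1, K2 = 16, 32                         # number of filter kernels per layer
layer1 = nn.Conv2d(3, K1, kernel_size=3, stride=1, padding=1)   # layer 820
layer2 = nn.Conv2d(K1, K2, kernel_size=3, stride=1, padding=1)  # layer 830

input_image = torch.randn(1, 3, 64, 64)     # input image 810 (RGB, assumed size)
feature_maps_825 = layer1(input_image)      # multi-channel feature maps 825
feature_maps_835 = layer2(feature_maps_825) # multi-channel feature maps 835

print(feature_maps_825.shape)   # torch.Size([1, 16, 64, 64]) -> K1 channels
print(feature_maps_835.shape)   # torch.Size([1, 32, 64, 64]) -> K2 channels
```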
  • FIG. 9 is a diagram for explaining an embodiment of the pooling technique.
  • In this example, the pooling window size is 2x2, the stride is 2, and the output image 990 can be generated by applying max pooling to the input image 910.
  • First, a 2x2 window is applied to the upper left of the input image 910, and the representative value (here, the maximum value 4) of the values in that window region is written to the corresponding position 920 of the output image 990.
  • Next, the window is shifted by the stride, that is, by 2, and the maximum value 3 of the values in the window 930 region is written to the corresponding position 940 of the output image 990.
  • The above process is repeated, moving down from the left side of the input image by the stride. That is, as shown in FIG. 9C, the maximum value 5 of the values in the window 950 region is written to the corresponding position 960 of the output image 990.
  • Finally, the window is moved by the stride, and the maximum value 2 of the values in the window 970 region is written to the corresponding position 980 of the output image 990. The sketch below follows the same procedure.
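  • The sketch below implements 2x2 max pooling with stride 2 following the window-by-window procedure just described. The 4x4 input is an assumption (the figure itself is not reproduced here), chosen so that the window maxima match the values 4, 3, 5, and 2 mentioned above.

```python
# Illustrative sketch only: 2x2 max pooling with stride 2.
# The input values are hypothetical, not taken from FIG. 9.
import numpy as np

def max_pool_2d(image: np.ndarray, window: int = 2, stride: int = 2) -> np.ndarray:
    h, w = image.shape
    out_h, out_w = (h - window) // stride + 1, (w - window) // stride + 1
    out = np.empty((out_h, out_w), dtype=image.dtype)
    for i in range(out_h):
        for j in range(out_w):
            # Take the representative (maximum) value inside each window.
            patch = image[i * stride:i * stride + window,
                          j * stride:j * stride + window]
            out[i, j] = patch.max()
    return out

x = np.array([[1, 4, 2, 3],
              [0, 2, 1, 0],
              [5, 1, 0, 2],
              [3, 2, 1, 1]])
print(max_pool_2d(x))   # [[4 3]
                        #  [5 2]]
```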
  • Although the exemplary methods of this disclosure are represented as a series of operations for clarity of explanation, this is not intended to limit the order in which the steps are performed; if necessary, steps may be performed simultaneously or in a different order.
  • To implement a method according to the present disclosure, the illustrated steps may include additional steps, may include only a subset of the steps, or may include additional steps while excluding some steps.
  • various embodiments of the present disclosure may be implemented by hardware, firmware, software, or a combination thereof.
  • For a hardware implementation, one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), general purpose processors, controllers, microcontrollers, microprocessors, and the like may be used.
  • the present invention can be used to transform images.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a method and device for transforming an image using a reference image selected on the basis of classification results of objects included in the image. The image transformation method according to one embodiment of the present invention may comprise: an image input step of receiving an image; an object classification step of classifying objects included in the input image; a reference image selection step of selecting a reference image on the basis of the classification results obtained in the object classification step; and an image transformation step of generating a transformed image by transforming, using the selected reference image, at least one of the input image and an image associated with the objects.
PCT/KR2018/016863 2017-12-29 2018-12-28 Procédé et dispositif de transformation d'image Ceased WO2019132590A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020170183859A KR101890538B1 (ko) 2017-12-29 2017-12-29 영상 변환 방법 및 장치
KR10-2017-0183859 2017-12-29

Publications (1)

Publication Number Publication Date
WO2019132590A1 true WO2019132590A1 (fr) 2019-07-04

Family

ID=63453729

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2018/016863 Ceased WO2019132590A1 (fr) 2017-12-29 2018-12-28 Procédé et dispositif de transformation d'image

Country Status (2)

Country Link
KR (1) KR101890538B1 (fr)
WO (1) WO2019132590A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210241463A1 (en) * 2020-02-03 2021-08-05 Korea Institute Of Science And Technology Method for supporting x-ray image reading using image transform model and system performing the same

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200109963A1 (en) * 2018-12-06 2020-04-09 Ron Zass Selectively Forgoing Actions Based on Fullness Level of Containers
KR102378742B1 (ko) * 2020-07-30 2022-03-28 한국과학기술연구원 사용자의 x-ray 영상 판독을 지원하는 시스템 및 방법
KR102323602B1 (ko) * 2020-07-31 2021-11-10 (주)제이엘케이 보안영상 출력 장치 및 방법
KR102466156B1 (ko) * 2021-01-19 2022-11-14 주식회사 에이아이웨어 컨벌루셔널 신경망 연산 방법
KR102404166B1 (ko) * 2021-07-20 2022-06-02 국민대학교산학협력단 스타일 전이를 활용한 엑스레이 영상의 유체 탐지 방법 및 장치
KR102565480B1 (ko) * 2021-09-03 2023-08-14 도평건기 주식회사 메신저 대화 카메라 촬영 방지 장치 및 방법

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20130066966A (ko) * 2011-12-13 2013-06-21 중앙대학교 산학협력단 주요 밝기 레벨 및 적응적 밝기값 변환에 의한 대비 개선 장치 및 방법
KR20130135315A (ko) * 2011-03-15 2013-12-10 돌비 레버러토리즈 라이쎈싱 코오포레이션 이미지 데이터 변환을 위한 방법 및 장치
KR20140008980A (ko) * 2012-07-05 2014-01-22 연세대학교 산학협력단 다중 레이어 오버랩 블록 기반의 로컬 히스토그램 평활화를 이용한 영상 콘트라스트 향상 방법 및 장치
KR20150111394A (ko) * 2014-03-20 2015-10-06 인하대학교 산학협력단 X-ray 영상을 위한 선명도 및 대비 향상 방법 및 장치
KR20160131848A (ko) * 2015-05-08 2016-11-16 삼성전자주식회사 객체 인식 장치 및 방법

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20130135315A (ko) * 2011-03-15 2013-12-10 돌비 레버러토리즈 라이쎈싱 코오포레이션 이미지 데이터 변환을 위한 방법 및 장치
KR20130066966A (ko) * 2011-12-13 2013-06-21 중앙대학교 산학협력단 주요 밝기 레벨 및 적응적 밝기값 변환에 의한 대비 개선 장치 및 방법
KR20140008980A (ko) * 2012-07-05 2014-01-22 연세대학교 산학협력단 다중 레이어 오버랩 블록 기반의 로컬 히스토그램 평활화를 이용한 영상 콘트라스트 향상 방법 및 장치
KR20150111394A (ko) * 2014-03-20 2015-10-06 인하대학교 산학협력단 X-ray 영상을 위한 선명도 및 대비 향상 방법 및 장치
KR20160131848A (ko) * 2015-05-08 2016-11-16 삼성전자주식회사 객체 인식 장치 및 방법

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210241463A1 (en) * 2020-02-03 2021-08-05 Korea Institute Of Science And Technology Method for supporting x-ray image reading using image transform model and system performing the same
US11734831B2 (en) * 2020-02-03 2023-08-22 Korea Institute Of Science And Technology Method for supporting X-RAY image reading using image transform model and system performing the same

Also Published As

Publication number Publication date
KR101890538B1 (ko) 2018-08-30

Similar Documents

Publication Publication Date Title
WO2019132590A1 (fr) Procédé et dispositif de transformation d'image
Shoieb et al. Computer-aided model for skin diagnosis using deep learning
Kalouche et al. Vision-based classification of skin cancer using deep learning
WO2019132589A1 (fr) Dispositif de traitement d'images et procédé de détection d'objets multiples
Şengür et al. White blood cell classification based on shape and deep features
WO2019132588A1 (fr) Dispositif et procédé d'analyse d'image basés sur une caractéristique d'image et un contexte
Hassanat et al. Colour-based lips segmentation method using artificial neural networks
Salama et al. Enhancing medical image quality using neutrosophic fuzzy domain and multi-level enhancement transforms: A comparative study for leukemia detection and classification
WO2019189972A1 (fr) Méthode d'analyse d'image d'iris par l'intelligence artificielle de façon à diagnostiquer la démence
WO2019132592A1 (fr) Dispositif et procédé de traitement d'image
WO2021137454A1 (fr) Procédé et système à base d'intelligence artificielle pour analyser des informations médicales d'utilisateur
Mohseni et al. Can self-training identify suspicious ugly duckling lesions?
CN116503858B (zh) 一种基于生成模型的免疫荧光图像分类方法及系统
Nandy et al. An incremental feature extraction framework for referable diabetic retinopathy detection
Guefrechi et al. Deepfake video detection using InceptionResnetV2
Hamad et al. Improving nuclei classification performance in h&e stained tissue images using fully convolutional regression network and convolutional neural network
WO2023182702A1 (fr) Dispositif et procédé de traitement de données de diagnostic par intelligence artificielle pour des images numériques de pathologie
CN119888817A (zh) 基于剔除初始基础表情状态影响的老年人微表情识别方法、装置、处理器及其存储介质
Oyewole et al. Bridging communication gap among people with hearing impairment: An application of image processing and artificial neural network
Cai et al. Detecting abnormal behavior in examination surveillance video with 3D convolutional neural networks
Asha An enhanced deep learning algorithms for image recognition and plant leaf disease detection
Sadhana et al. Prediction of skin cancer using convolutional neural network
CN111191620B (zh) 一种人-物交互检测数据集的构建方法
Yong et al. Human motion analysis in dark surrounding using line skeleton scalable model and vector angle technique
Nguyen et al. A new method for splitting clumped cells in red blood images

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18895734

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18895734

Country of ref document: EP

Kind code of ref document: A1