WO2023018179A1 - Electronic device for reproducing an image based on AI according to an application, and image reproduction method thereby
- Publication number
- WO2023018179A1 (application PCT/KR2022/011855)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- neural network
- sub
- upscaling
- color component
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4046—Scaling of whole images or parts thereof, e.g. expanding or contracting using neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/56—Extraction of image or video features relating to colour
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/7715—Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/96—Management of image or video recognition tasks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/98—Detection or correction of errors, e.g. by rescanning the pattern or by human intervention; Evaluation of the quality of the acquired patterns
- G06V10/993—Evaluation of the quality of the acquired pattern
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/4402—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
- H04N21/440263—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by altering the spatial resolution, e.g. for displaying on a connected PDA
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/45—Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
- H04N21/466—Learning process for intelligent management, e.g. learning user preferences for recommending movies
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/81—Monomedia components thereof
- H04N21/8166—Monomedia components thereof involving executable data, e.g. software
- H04N21/8173—End-user applications, e.g. Web browser, game
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/85—Assembly of content; Generation of multimedia applications
- H04N21/858—Linking data to content, e.g. by linking an URL to a video object, by creating a hotspot
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/85—Assembly of content; Generation of multimedia applications
- H04N21/858—Linking data to content, e.g. by linking an URL to a video object, by creating a hotspot
- H04N21/8586—Linking data to content, e.g. by linking an URL to a video object, by creating a hotspot by using a URL
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/285—Selection of pattern recognition techniques, e.g. of classifiers in a multi-classifier system
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30168—Image quality inspection
Definitions
- The present disclosure relates to AI-based image reproduction and, more particularly, to AI-based image reproduction according to the application used to acquire an image.
- Provided are an electronic device and method capable of effectively improving image quality by AI-processing an image according to the characteristics of images provided through an application.
- Also provided are an electronic device and method capable of effectively improving image quality by predicting through which application an image previously stored in the electronic device was provided.
- Also provided is a neural network structure suitable for the YUV format of an image.
- An electronic device for reproducing an image comprises: a display; and a processor, wherein the processor may acquire a first image, identify an application used to acquire the first image, select neural network setting information corresponding to the identified application from among a plurality of pieces of neural network setting information, obtain a second image by AI (Artificial Intelligence) upscaling the first image using an upscaling neural network to which the selected neural network setting information is applied, and output the obtained second image through the display.
- The quality of an image can be effectively improved by AI-processing the image according to the characteristics of images provided through an application.
- The quality of an image can be effectively improved by predicting through which application an image previously stored in an electronic device was provided.
- FIG. 1A is a diagram for explaining an AI decoding process according to an embodiment.
- FIG. 1B is a diagram for explaining the AI upscaling process shown in FIG. 1A.
- FIG. 2 is a block diagram illustrating a configuration of an electronic device according to an exemplary embodiment.
- FIG. 3 is an exemplary diagram illustrating an upscaling neural network for AI upscaling of a first image.
- FIG. 4 is a diagram for explaining a convolution operation by a convolution layer.
- FIG. 5 is an exemplary diagram illustrating a correspondence relationship between various applications and various neural network setting information.
- FIG. 6 is an exemplary diagram illustrating information related to applications required to acquire an image.
- FIG. 7 is an exemplary diagram illustrating an upscaling neural network for AI upscaling of a first image.
- FIG. 8 is a diagram for explaining an operation of a depth to space layer shown in FIG. 7 .
- FIG. 9 is an exemplary diagram illustrating an upscaling neural network for AI upscaling of a first image.
- FIG. 10 is a diagram for explaining a method of training a neural network for upscaling.
- FIG. 11 is a diagram for explaining a training process of a neural network for upscaling by a training device.
- FIG. 12 is a flowchart illustrating an AI decoding method by an electronic device according to an embodiment.
- An electronic device for reproducing an image includes a display; and a processor, wherein the processor may acquire a first image, identify an application used to acquire the first image, select neural network setting information corresponding to the identified application from among a plurality of pieces of neural network setting information, obtain a second image by AI (Artificial Intelligence) upscaling the first image using an upscaling neural network to which the selected neural network setting information is applied, and output the obtained second image through the display.
- the processor may select the neural network setting information by comparing information about images provided through a plurality of applications with information related to the first image.
- the processor may check an address accessed by a web browser to obtain the first image, compare the checked address with addresses corresponding to a plurality of applications, and select the neural network setting information.
- The upscaling neural network may be trained based on a comparison between an original training image and a second training image that the upscaling neural network obtains by AI upscaling a first training image provided through a plurality of applications.
- Neural network setting information corresponding to a first application among the plurality of applications may be obtained by training the upscaling neural network on a first training image provided by the first application, and neural network setting information corresponding to a second application among the plurality of applications may be obtained by training the upscaling neural network on a first training image provided by the second application.
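The per-application comparison between original and upscaled training images can be sketched as follows. This is only an illustration: the passage above does not say which loss is used, so mean absolute error is an assumption, and `mean_abs_error`, `loss_per_application`, and the application names are hypothetical.

```python
from typing import Dict, List, Tuple

Image = List[List[float]]


def mean_abs_error(original: Image, upscaled: Image) -> float:
    """Average absolute sample difference (assumed loss, not from the source)."""
    if len(original) != len(upscaled) or len(original[0]) != len(upscaled[0]):
        raise ValueError("images must have the same size")
    total = sum(abs(o - u)
                for orow, urow in zip(original, upscaled)
                for o, u in zip(orow, urow))
    return total / (len(original) * len(original[0]))


def loss_per_application(
        pairs_by_app: Dict[str, List[Tuple[Image, Image]]]) -> Dict[str, float]:
    """Compute one loss value per application from (original, upscaled) pairs."""
    return {
        app: sum(mean_abs_error(o, u) for o, u in pairs) / len(pairs)
        for app, pairs in pairs_by_app.items()
    }


# Hypothetical 2x2 training pair for a hypothetical application "app_a".
losses = loss_per_application({
    "app_a": [([[0, 2], [4, 6]], [[1, 2], [4, 4]])],
})
```

Keeping the loss separate per application mirrors the idea that each application ends up with its own setting information.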
- The processor may determine whether to perform AI upscaling based on a comparison between the resolution of the acquired first image and a predetermined resolution, output the first image through the display when it determines not to perform the AI upscaling, and output the second image through the display when it determines to perform the AI upscaling.
- The first image includes a first sub-image of a first color component, a second sub-image of a second color component, and a third sub-image of a third color component, and the size of the first sub-image is larger than the size of the second sub-image and the size of the third sub-image. The upscaling neural network scales the size of the second sub-image and the size of the third sub-image to be the same as that of the first sub-image, and includes a depth to space layer. The processor may obtain a second image including a fourth sub-image of the first color component corresponding to a feature map of the first color component, a fifth sub-image of the second color component corresponding to a feature map of the second color component, and a sixth sub-image of the third color component corresponding to a feature map of the third color component.
- The size of the first sub-image is n times (n is a natural number) larger than the size of the second sub-image and the size of the third sub-image, and the number of feature maps used to generate the feature map of the first color component may be n times greater than the number of feature maps used to generate the feature map of the second color component and the feature map of the third color component.
- The upscaling neural network further includes a second scaling layer that scales the first sub-image, the second sub-image, and the third sub-image based on a scaling factor of the upscaling neural network, and the processor may obtain the fourth sub-image by adding the scaled first sub-image to the feature map of the first color component, obtain the fifth sub-image by adding the scaled second sub-image to the feature map of the second color component, and obtain the sixth sub-image by adding the scaled third sub-image to the feature map of the third color component.
- Sample values of the fourth sub-image, the fifth sub-image, and the sixth sub-image may be clipped to a value within a predetermined range.
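The addition of a scaled sub-image to a feature map, followed by clipping, can be illustrated with a short sketch. The range 0 to 255 (8-bit samples) is an assumption, since the text only says "a predetermined range", and `add_and_clip` is a hypothetical helper name.

```python
from typing import List

Plane = List[List[float]]


def add_and_clip(feature_map: Plane, scaled_sub_image: Plane,
                 low: float = 0, high: float = 255) -> Plane:
    """Add two equally sized planes sample by sample and clip to [low, high].

    The default range assumes 8-bit samples; the source does not fix it.
    """
    return [
        [min(high, max(low, f + s)) for f, s in zip(frow, srow)]
        for frow, srow in zip(feature_map, scaled_sub_image)
    ]


# Hypothetical 2x2 feature map and scaled sub-image.
result = add_and_clip([[10, -300], [5, 0]], [[250, 20], [100, 0]])
```

Here the sum 260 is clipped to 255 and the sum -280 is clipped to 0, while in-range sums pass through unchanged.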
- The first image includes a first sub-image of a first color component, a second sub-image of a second color component, and a third sub-image of a third color component. The upscaling neural network includes at least one convolution layer for convolution processing an image, and a depth to space layer that generates a feature map of the first color component by combining feature maps output from the at least one convolution layer. The processor may obtain a second image including a fourth sub-image of the first color component corresponding to the feature map of the first color component, a fifth sub-image of the second color component scaled from the second sub-image by a scaling factor of the upscaling neural network, and a sixth sub-image of the third color component scaled from the third sub-image by the scaling factor of the upscaling neural network.
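A depth to space layer combines several low-resolution feature maps into one larger map. The sketch below shows one common arrangement, in which map number `dy * block + dx` fills offset `(dy, dx)` of each `block` x `block` output cell. The ordering used in the patent's figures is not given in this passage, so this layout is an assumption, and `depth_to_space` is a hypothetical helper name.

```python
from typing import List

Plane = List[List[float]]


def depth_to_space(feature_maps: List[Plane], block: int) -> Plane:
    """Interleave block*block maps of size HxW into one (H*block)x(W*block) map."""
    if len(feature_maps) != block * block:
        raise ValueError("expected block*block feature maps")
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    out = [[0.0] * (width * block) for _ in range(height * block)]
    for index, fmap in enumerate(feature_maps):
        dy, dx = divmod(index, block)  # position inside each output cell
        for y in range(height):
            for x in range(width):
                out[y * block + dy][x * block + dx] = fmap[y][x]
    return out


# Four 1x2 maps become one 2x4 map (scaling factor 2).
combined = depth_to_space([[[1, 5]], [[2, 6]], [[3, 7]], [[4, 8]]], block=2)
```

With a scaling factor of 2, four feature maps are needed per output color component, which is why the number of feature maps grows with the target size.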
- a method of reproducing an image comprising: acquiring a first image; identifying an application used to acquire the first image; selecting neural network setting information corresponding to the identified application among a plurality of neural network setting information; acquiring a second image by artificial intelligence (AI) upscaling of the first image using an upscaling neural network to which the selected neural network setting information is applied; and providing the acquired second image to a display.
- Selecting the neural network setting information may include selecting the neural network setting information based on a comparison between information about images provided through a plurality of applications and information related to the first image.
- the selecting of the neural network setting information may include identifying an address accessed by a web browser to acquire the first image; and selecting the neural network setting information by comparing addresses corresponding to a plurality of applications with the identified address.
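Identifying the application from an address accessed by a web browser can be sketched as a host-name comparison. The host names and application names below are hypothetical placeholders, and matching on the host name (including subdomains) is an assumption about how the comparison might be done.

```python
from typing import Dict, Optional
from urllib.parse import urlparse

# Hypothetical mapping from a provider's host name to an application.
APPLICATION_BY_HOST: Dict[str, str] = {
    "video-a.example.com": "app_a",
    "video-b.example.com": "app_b",
}


def identify_application(address: str) -> Optional[str]:
    """Return the application whose known host matches the accessed address."""
    host = urlparse(address).hostname
    if host is None:
        return None
    for known_host, application in APPLICATION_BY_HOST.items():
        # Exact match, or a subdomain of the known host.
        if host == known_host or host.endswith("." + known_host):
            return application
    return None
```

An address that matches no known host returns `None`, in which case a caller could fall back to a default behavior.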
- The expression "at least one of a, b, or c" means "a", "b", "c", "a and b", "a and c", "b and c", or "a, b, and c".
- When one component is referred to as "connected" or "coupled" to another component, the one component may be directly connected or coupled to the other component, but unless specifically stated otherwise, it should be understood that the components may also be connected or coupled via another component in between.
- Components expressed as '~ unit', 'module', etc. may be two or more components combined into one component, or one component divided into two or more components for each more subdivided function.
- Each of the components described below may additionally perform some or all of the functions of other components in addition to its own main function, and some of the main functions of each component may, of course, be performed exclusively by another component.
- 'image' or 'picture' may refer to a still image, a motion picture composed of a plurality of continuous still images (or frames), or a video.
- A 'sample' may be data (e.g., a pixel) allocated to a sampling location of an image or feature map.
- A 'provider' is a subject that provides an image to an electronic device, and may mean a company, a server operated by the company, a service operated by the company, or a server for providing a service operated by the company.
- 'application' refers to a program required to acquire an image from a provider through a network (eg, the Internet).
- the application may be provided to the electronic device from a provider or an external server and installed in the electronic device, or may be installed in the electronic device during manufacturing of the electronic device.
- A 'neural network' is a representative example of an artificial neural network model that simulates the nerves of the brain, and is not limited to an artificial neural network model using a specific algorithm.
- a neural network may also be referred to as a deep neural network.
- 'parameter' is a value used in the calculation process of each layer constituting the neural network, and may include, for example, a weight and/or bias used when applying an input value to a predetermined calculation expression.
- Parameters can be expressed in matrix form.
- a parameter is a value set as a result of training and can be updated through separate training data as needed.
- 'neural network setting information' is information related to elements constituting a neural network, and includes the aforementioned parameters.
- a neural network may be set using the neural network setting information.
- Neural network setting information may be referred to as deep neural network setting information.
- 'first image' means an image subject to AI upscaling
- 'second image' means an image generated by AI upscaling
- 'encoding' means processing by a frequency conversion-based image compression method. Also, 'decoding' means processing by a frequency conversion-based image restoration method.
- FIG. 1A is a diagram for explaining an AI decoding process according to an embodiment.
- a second image 125 is obtained through an AI decoding process for AI-encoded data provided from the server 10 (or a provider).
- AI-encoded data may include image data and AI data.
- Image data may be provided as a bit stream.
- Image data is data generated as a result of encoding the first image 115, and may include data obtained based on pixel values in the first image 115, for example, residual data that is the difference between the first image 115 and prediction data of the first image 115.
- the image data includes information used in the encoding process of the first image 115 .
- the image data may include prediction mode information used to encode the first image 115, motion information, and quantization parameter-related information used in the encoding process.
- Image data may be generated according to the rules, for example, the syntax, of an image compression method using frequency conversion, such as MPEG-2, H.264 AVC, MPEG-4, HEVC, VC-1, VP8, VP9, or AV1.
- the AI data may be used to select neural network setting information used in the AI upscaling process 120 described later. AI data will be described in detail with reference to FIG. 2 . Depending on the embodiment, AI-encoded data may include only image data.
- the decoding process 110 shown in FIG. 1A is described in detail.
- The decoding process 110 may include a process of generating quantized residual data by entropy decoding image data, a process of inverse quantizing the quantized residual data, a process of converting residual data of frequency domain components into spatial domain components, a process of generating prediction data, and a process of restoring the first image 115 using the prediction data and the residual data.
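The later steps of the decoding process (inverse quantization, conversion from the frequency domain to the spatial domain, and reconstruction from prediction plus residual) can be sketched for a single block. This is a simplified illustration under stated assumptions: entropy decoding and prediction are omitted, a floating-point orthonormal DCT and a single uniform quantization step stand in for the integer transforms and quantization rules of real codecs, and all function names are hypothetical.

```python
import math
from typing import List

Block = List[List[float]]


def dct_matrix(n: int) -> Block:
    """Orthonormal DCT-II basis; row u holds the u-th frequency."""
    return [
        [math.sqrt((1 if u == 0 else 2) / n)
         * math.cos(math.pi * (2 * x + 1) * u / (2 * n))
         for x in range(n)]
        for u in range(n)
    ]


def inverse_transform(coeffs: Block) -> Block:
    """Convert frequency-domain coefficients into spatial-domain residuals."""
    n = len(coeffs)
    c = dct_matrix(n)
    return [
        [sum(c[u][y] * coeffs[u][v] * c[v][x]
             for u in range(n) for v in range(n))
         for x in range(n)]
        for y in range(n)
    ]


def reconstruct_block(levels: List[List[int]], qstep: float,
                      prediction: Block) -> List[List[int]]:
    """Inverse quantize, inverse transform, then add the prediction."""
    coeffs = [[level * qstep for level in row] for row in levels]
    residual = inverse_transform(coeffs)
    # Clip to the 8-bit range (an assumption about the sample bit depth).
    return [
        [min(255, max(0, round(p + r))) for p, r in zip(prow, rrow)]
        for prow, rrow in zip(prediction, residual)
    ]


# A 4x4 block whose only non-zero level is the DC coefficient.
levels = [[2, 0, 0, 0]] + [[0, 0, 0, 0] for _ in range(3)]
prediction = [[100.0] * 4 for _ in range(4)]
block = reconstruct_block(levels, qstep=8, prediction=prediction)
```

In this example the DC level 2 with step 8 yields a coefficient of 16, which the 4x4 inverse transform spreads as a residual of 4 over every sample, so each reconstructed sample is 104.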
- Such a decoding process 110 may be implemented through at least one image restoration method using frequency conversion such as MPEG-2, H.264, MPEG-4, HEVC, VC-1, VP8, VP9, and AV1.
- AI upscaling is performed on the first image 115 to obtain a second image 125 having a predetermined resolution and/or a predetermined quality.
- The AI of the AI upscaling process 120 may be implemented as a neural network. As will be described later with reference to FIG. 10, since the upscaling neural network is trained through loss information calculated for each application, the first image 115 can be AI upscaled to a desired quality and/or resolution according to the characteristics of the application.
- FIG. 1B is a diagram for explaining the AI upscaling process 120 shown in FIG. 1A.
- In the AI upscaling process 120, which model is suitable for AI upscaling the first image 115 may be determined through analysis 121 of the first image 115. Specifically, through the analysis 121 of the first image 115, it is confirmed through which application the first image 115 was acquired, and it can be determined which of several pre-stored models 122 is suitable for AI-based processing 123 of the first image 115.
- For example, when the model for 'A' is selected, the second image 125 may be acquired by AI processing 123 the first image 115 based on the model for 'A'.
- the AI upscale process 120 is described in detail below.
- FIG. 2 is a block diagram illustrating a configuration of an electronic device 200 according to an exemplary embodiment.
- an electronic device 200 includes a receiver 210 and an AI decoder 230.
- the AI decoder 230 may include a parser 232, a decoder 234, an AI upscaler 236, and an AI setter 238.
- the electronic device 200 may include various types of devices capable of reproducing images, such as smart phones, tablet PCs, wearable devices, laptop computers, and desktop PCs.
- the receiving unit 210 and the AI decoding unit 230 may be implemented by one processor.
- The receiver 210 and the AI decoder 230 may be implemented as a dedicated processor, or may be implemented through a combination of S/W and a general-purpose processor such as an application processor (AP), a central processing unit (CPU), or a graphics processing unit (GPU).
- a dedicated processor may include a memory for implementing an embodiment of the present disclosure or a memory processing unit for using an external memory.
- the receiving unit 210 and the AI decoding unit 230 may be composed of a plurality of processors. In this case, it may be implemented with a combination of dedicated processors, or it may be implemented with a combination of S/W and a plurality of general-purpose processors such as APs, CPUs, or GPUs.
- In one example, the receiving unit 210 may be implemented with a first processor, the parsing unit 232 and the decoding unit 234 may be implemented with a second processor different from the first processor, and the AI setting unit 238 may be implemented with a third processor different from the first and second processors.
- the receiver 210 receives AI-encoded data for the first image 115 .
- the AI-encoded data may include image data generated as a result of encoding the first image 115 and AI data. Depending on the embodiment, AI-encoded data may include only image data.
- the receiver 210 may receive AI-encoded data transmitted from the server 10 (or provider) through a network.
- The receiving unit 210 may obtain AI-encoded data from a data storage medium, including a magnetic medium such as a hard disk, a floppy disk, or a magnetic tape, an optical recording medium such as a CD-ROM or a DVD, or a magneto-optical medium such as a floptical disk.
- the receiver 210 may receive AI-encoded data from the server 10 through an application, and the AI decoder 230 may obtain the second image 125 by AI-decoding the AI-encoded data.
- the AI-encoded data may be previously received by the receiver 210 and stored in the storage medium of the electronic device 200.
- the AI decoder 230 may acquire the second image 125 by AI-decoding the AI-encoded data stored in the storage medium according to a request of a user or the like.
- the receiving unit 210 outputs the AI-encoded data to the parsing unit 232.
- The parsing unit 232 parses the AI-encoded data and divides it into image data and AI data. For example, the parsing unit 232 may read a header of the data acquired from the receiving unit 210 to determine whether the corresponding data is image data or AI data. In one example, the parsing unit 232 refers to the header of the data received through the receiving unit 210 to distinguish between image data and AI data, and transmits them to the decoding unit 234 and the AI setting unit 238, respectively. At this time, it may also be identified that the image data included in the AI-encoded data was generated through a predetermined codec (e.g., MPEG-2, H.264, MPEG-4, HEVC, VC-1, VP8, VP9, or AV1). In this case, corresponding information may be transmitted to the decoding unit 234 so that the image data can be processed with the identified codec.
- the decoder 234 decodes the image data to obtain the first image 115 .
- the first image 115 obtained by the decoder 234 is provided to the AI upscaler 236.
- the decoder 234 may provide information related to encoding of the first image 115 to the AI setting unit 238 .
- the information related to the encoding of the first image 115 may include a codec type used for encoding the first image 115, prediction mode information, motion information, quantization parameter (QP) information, and the like.
- the receiving unit 210 may receive the first image 115 itself from the provider. Receiving the first image 115 itself may be understood as receiving pixel values of the first image 115 . In this case, the receiver 210 provides the first image 115 to the AI upscaler 236, and decoding by the decoder 234 is not performed.
- Obtaining the first image 115 may mean receiving AI-encoded data (or image data) for the first image 115, or acquiring the first image 115 itself.
- the AI setting unit 238 may identify an application used (or executed) to obtain the first image 115 and determine an upscale target based on the identified application.
- the upscaling target may indicate, for example, to what resolution and/or quality the first image 115 should be upscaled.
- Determining the upscale target may be understood as a process of selecting neural network setting information corresponding to an application used to acquire the first image 115 from among a plurality of pre-stored neural network setting information.
- the AI setting unit 238 transfers neural network setting information corresponding to the identified application to the AI upscaling unit 236.
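Selecting setting information for the identified application amounts to a table lookup, which can be sketched as follows. The application names, the setting identifiers, and the fallback entry are all hypothetical. A real device would store parameter sets (for example, weights and biases) rather than short strings.

```python
from typing import Dict, Optional

# Hypothetical table: application name -> stored neural network setting information.
SETTINGS_BY_APPLICATION: Dict[str, str] = {
    "app_a": "settings_P",
    "app_b": "settings_Q",
    "app_c": "settings_R",
}
DEFAULT_SETTINGS = "settings_default"  # assumed fallback, not stated in the source


def select_setting_information(application: Optional[str]) -> str:
    """Return the pre-stored setting information for the identified application."""
    if application is None:
        return DEFAULT_SETTINGS
    return SETTINGS_BY_APPLICATION.get(application, DEFAULT_SETTINGS)
```

The selected entry would then be handed to the upscaling step, so that the same network structure can behave differently for each application.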
- the AI upscaling unit 236 AI upscales the first image 115 through an upscaling neural network to obtain a second image 125 corresponding to an upscale target.
- the AI upscaling unit 236 may determine whether or not to perform AI upscaling on the first image 115 .
- the AI upscaler 236 may compare the resolution of the first image 115 with a predetermined resolution in order to determine the type of image output from the AI decoder 230 .
- For example, if the resolution of the first image 115 is equal to or less than a predetermined first resolution, or equal to or greater than a predetermined second resolution that is greater than the first resolution, the AI upscaler 236 determines not to perform AI upscaling on the first image 115, and the first image 115 is output from the AI decoder 230. When the resolution of the first image 115 is equal to or less than the predetermined first resolution, the first image 115 is regarded as a thumbnail image and AI upscaling is skipped; when the resolution is equal to or greater than the predetermined second resolution, the quality of the first image 115 is regarded as satisfactory and AI upscaling is skipped.
- The AI upscaler 236 determines to perform AI upscaling on the first image 115 when the resolution of the first image 115 lies between the predetermined first resolution and the predetermined second resolution, and the second image 125 obtained as a result of AI upscaling the first image 115 is output from the AI decoder 230.
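The resolution check can be sketched as a simple range test. The two threshold values are hypothetical, since the text only calls them "predetermined", and comparing resolutions by total pixel count is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resolution:
    width: int
    height: int

    @property
    def pixels(self) -> int:
        return self.width * self.height


# Hypothetical thresholds; the source does not give concrete values.
FIRST_RESOLUTION = Resolution(320, 180)      # at or below: treated as a thumbnail
SECOND_RESOLUTION = Resolution(3840, 2160)   # at or above: quality deemed satisfactory


def should_ai_upscale(image: Resolution,
                      first: Resolution = FIRST_RESOLUTION,
                      second: Resolution = SECOND_RESOLUTION) -> bool:
    """Upscale only when the resolution lies strictly between the two thresholds."""
    return first.pixels < image.pixels < second.pixels
```

When the function returns `False`, the first image would be output as is. Otherwise the upscaled second image would be output.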
- the AI setting unit 238 selects the neural network setting information corresponding to the application used to acquire the first image 115.
- a method for the AI setting unit 238 to select the neural network setting information will be described.
- an AI upscaling process through an upscaling neural network will be described with reference to FIGS. 3 and 4 .
- FIG. 3 is an exemplary diagram illustrating an upscaling neural network 300 for AI upscaling of the first image 115.
- FIG. 4 shows a convolution operation in the first convolution layer 310 shown in FIG. 3.
- the first image 115 is input to the first convolution layer 310 .
- 3X3X4 displayed on the first convolution layer 310 shown in FIG. 3 exemplifies convolution processing on one input image using four filter kernels each having a size of 3x3.
- 4 feature maps are generated by 4 filter kernels.
- Each feature map represents unique characteristics of the first image 115 .
- each feature map may indicate a vertical direction characteristic, a horizontal direction characteristic, or an edge characteristic of the first image 115 .
- One feature map 430 may be created for one filter kernel 410. Since four filter kernels are used in the first convolution layer 310, four feature maps can be generated through a convolution operation process using the four filter kernels.
- I1 to I49 displayed on the first image 115 represent pixels of the first image 115
- F1 to F9 displayed on the filter kernel 410 represent parameters of the filter kernel 410
- M1 to M9 displayed on the feature map 430 represent samples of the feature map 430 .
- In FIG. 4, the first image 115 includes 49 pixels, but this is only an example; when the first image 115 has a resolution of 4K, it may contain, for example, 3840 X 2160 pixels.
- pixel values of I1, I2, I3, I8, I9, I10, I15, I16, and I17 of the first image 115 and F1, F2, F3, F4, F5, F6, F7, F8, and F9 of the filter kernel 410 are respectively multiplied, and a value obtained by combining (e.g., an addition operation) the result values of the multiplication operation may be assigned as the value of M1 of the feature map 430.
- pixel values of I3, I4, I5, I10, I11, I12, I17, I18, and I19 of the first image 115 and F1, F2, F3, F4, F5, F6, F7, F8, and F9 of the filter kernel 410 are respectively multiplied, and a value obtained by combining the product values may be assigned as the value of M2 of the feature map 430.
- a convolution operation between pixel values in the first image 115 and parameters of the filter kernel 410 is performed.
- a feature map 430 having a predetermined size may be obtained.
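The multiply-and-add procedure above can be sketched in plain Python. The 7 X 7 image and the 3 X 3 kernel follow FIG. 4 as described; the stride of 2 is inferred from the fact that M1 starts at I1 and M2 starts at I3, and the concrete pixel and kernel values are placeholders chosen only so the result is easy to check by hand.

```python
def convolve2d(image, kernel, stride):
    """Slide a square kernel over a square image and sum the element-wise products."""
    k = len(kernel)
    out_size = (len(image) - k) // stride + 1
    feature_map = []
    for r in range(out_size):
        row = []
        for c in range(out_size):
            acc = 0
            for i in range(k):
                for j in range(k):
                    acc += image[r * stride + i][c * stride + j] * kernel[i][j]
            row.append(acc)
        feature_map.append(row)
    return feature_map


# Placeholder data: pixel In simply holds the value n (I1 = 1, ..., I49 = 49).
image = [[7 * r + c + 1 for c in range(7)] for r in range(7)]
# Placeholder kernel: F1..F9 are all 1, so each sample is the sum of its window.
kernel = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]

feature_map = convolve2d(image, kernel, stride=2)
print(feature_map[0][0])  # M1: I1+I2+I3+I8+I9+I10+I15+I16+I17
print(feature_map[0][1])  # M2: I3+I4+I5+I10+I11+I12+I17+I18+I19
```

With these placeholder values the feature map has 3 X 3 samples, matching M1 to M9.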
- Through training of the neural network 300 for upscaling, the parameters of the neural network 300 for upscaling, for example, the parameters of the filter kernels used in its convolution layers (e.g., F1, F2, F3, F4, F5, F6, F7, F8, and F9 of the filter kernel), may be optimized.
- the AI setting unit 238 determines an upscaling target according to the application used to acquire the first image 115, and may set the parameters corresponding to the determined upscaling target as the parameters of the filter kernels used in the convolution layers of the neural network 300 for upscaling.
- the convolutional layers included in the upscaling neural network 300 may perform processing according to the convolution operation process described in relation to FIG. 4, but the convolution operation process described in FIG. 4 is only an example, and the embodiments are not limited thereto.
- the feature maps output from the first convolution layer 310 are input to the first activation layer 320 .
- the first activation layer 320 may assign non-linear characteristics to each feature map.
- the first activation layer 320 may include, but is not limited to, a sigmoid function, a Tanh function, a Rectified Linear Unit (ReLU) function, and the like.
- Giving the nonlinear characteristics in the first activation layer 320 means changing and outputting some sample values of the feature map, which is an output of the first convolution layer 310 . At this time, the change is performed by applying nonlinear characteristics.
- the first activation layer 320 determines whether to transfer sample values of feature maps output from the first convolution layer 310 to the second convolution layer 330. For example, some sample values of the feature maps are activated by the first activation layer 320 and transferred to the second convolution layer 330, while other sample values are deactivated by the first activation layer 320 and are not transferred to the second convolution layer 330. The unique characteristics of the first image 115 indicated by the feature maps are emphasized by the first activation layer 320.
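As a rough illustration of how an activation layer passes some samples and suppresses others, the sketch below applies ReLU to a small feature map. The helper names and the sample values are assumptions for the example; the sigmoid and Tanh functions named in the text are applied the same way.

```python
import math


def relu(x):
    """Pass positive samples unchanged and deactivate the rest."""
    return x if x > 0 else 0.0


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def apply_activation(feature_map, fn):
    """Apply an activation function to every sample of a feature map."""
    return [[fn(v) for v in row] for row in feature_map]


# Placeholder feature map output by a convolution layer.
feature_map = [[-2.0, 0.5], [3.0, -0.1]]

activated = apply_activation(feature_map, relu)
print(activated)  # negative samples are not transferred to the next layer
print(apply_activation(feature_map, math.tanh))
```

With ReLU, the negative samples become zero, which corresponds to the "deactivated" samples that are not transferred to the second convolution layer 330.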
- the feature maps 325 output from the first activation layer 320 are input to the second convolution layer 330 .
- One of the feature maps 325 shown in FIG. 3 is a result of processing the feature map 430 described with reference to FIG. 4 in the first activation layer 320 .
- 3X3X4 displayed on the second convolution layer 330 exemplifies convolution processing on the input feature maps 325 using four filter kernels having a size of 3 x 3.
- the output of the second convolution layer 330 is input to the second activation layer 340 .
- the second activation layer 340 may impart nonlinear characteristics to input data.
- the feature maps 345 output from the second activation layer 340 are input to the third convolution layer 350 .
- 3X3X1 displayed on the third convolution layer 350 exemplifies convolution processing to generate one output image using one 3x3 filter kernel.
- the third convolution layer 350 is a layer for outputting a final image and generates one output using one filter kernel.
- the third convolution layer 350 may output the second image 125 through a convolution operation.
- In FIG. 3, the upscaling neural network 300 includes three convolution layers 310, 330, and 350 and two activation layers 320 and 340, but this is just one example, and the number of convolution layers and activation layers may be variously changed depending on the implementation. Also, according to embodiments, the upscaling neural network 300 may be implemented through a recurrent neural network (RNN). This case means changing the CNN structure of the neural network 300 for upscaling according to the example of the present disclosure into an RNN structure.
- the AI upscaler 236 may include at least one arithmetic logic unit (ALU) for the above-described convolution operation and activation layer operation.
- An ALU may be implemented as a processor.
- the ALU may include a multiplier for performing a multiplication operation between sample values of the first image 115 or feature map and sample values of the filter kernel, and an adder for adding multiplication result values.
- For the activation layer operation, the ALU may include a multiplier that multiplies an input sample value by a weight used in a predetermined sigmoid function, Tanh function, or ReLU function, and a comparator that compares the multiplication result with a predetermined value to determine whether to transfer the input sample value to the next layer.
- Hereinafter, a method by which the AI setting unit 238 determines an upscale target and the AI upscaling unit 236 AI upscales the first image 115 according to the upscale target will be described.
- the AI setting unit 238 may store a plurality of neural network setting information that can be set (or applicable) to the neural network for upscaling.
- the neural network setting information may include information on at least one of the number of convolution layers included in the upscale neural network, the number of filter kernels for each convolution layer, the size of each filter kernel, or a parameter of each filter kernel. .
- a plurality of neural network setting information may correspond to various upscaling targets, respectively, and a neural network for upscaling may operate based on neural network setting information corresponding to a specific upscaling target.
- neural networks for upscaling may have different structures according to neural network setting information.
- For example, a neural network for upscaling may include three convolutional layers according to one piece of neural network setting information, and may include four convolutional layers according to another piece of neural network setting information.
- the neural network setting information may include only parameters of a filter kernel used in an upscaling neural network.
- the structure of the neural network for upscaling is not changed, but only the parameters of the filter kernel may be changed according to the neural network setting information.
- the AI setting unit 238 may select neural network setting information for AI upscaling of the first image 115 from among a plurality of neural network setting information.
- a plurality of neural network setting information corresponds to a plurality of applications.
- a plurality of neural network setting information may correspond 1:1 to a plurality of applications.
- For example, 'P' neural network setting information may correspond to the 'A' application, 'Q' neural network setting information may correspond to the 'B' application, and 'R' neural network setting information may correspond to the 'C' application.
- Although FIG. 5 shows only three applications and three pieces of neural network setting information, the number of applications and pieces of neural network setting information corresponding to each other may vary.
- the AI setting unit 238 must check which application was used to obtain the first image 115 and obtain the neural network setting information corresponding to the identified application. If the first image 115 is acquired through the 'A' application, but neural network setting information other than the 'P' neural network setting information is selected for the operation of the neural network for upscaling, the image degradation characteristics caused by the 'A' application are not adequately compensated.
- When neural network setting information for AI upscaling of the first image 115 is selected from among the plurality of neural network setting information, the selected neural network setting information is transmitted to the AI upscaling unit 236, and the first image 115 may be processed based on the neural network for upscaling operating according to the selected neural network setting information.
- When the AI upscaler 236 receives neural network setting information from the AI setting unit 238, for the first convolution layer 310, the second convolution layer 330, and the third convolution layer 350 of the upscale neural network 300 shown in FIG. 3, the number of filter kernels included in each layer and the parameters of the filter kernels are set to the values included in the acquired neural network setting information.
- For example, the AI upscaling unit 236 may set the parameters of a 3 X 3 filter kernel used in any one convolution layer of the upscaling neural network 300 shown in FIG. 3 to {1, 1, 1, 1, 1, 1, 1, 1, 1}, and, if the neural network setting information changes, replace them with the parameters {2, 2, 2, 2, 2, 2, 2, 2, 2} included in the changed neural network setting information.
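The way selected setting information could be loaded into the convolution layers can be sketched as below. The class names, the dictionary layout, and the pairing of the all-ones and all-twos kernels with the 'P' and 'Q' settings are assumptions made for illustration; only the 'A' to 'P' and 'B' to 'Q' correspondence and the two example kernels come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class ConvLayer:
    # Each kernel is a flat list of nine parameters (F1..F9).
    kernels: list = field(default_factory=list)


@dataclass
class UpscalingNetwork:
    layers: list

    def apply_setting(self, setting):
        """Overwrite the number and parameters of filter kernels in every layer."""
        if len(setting) != len(self.layers):
            raise ValueError("setting must describe every convolution layer")
        for layer, kernels in zip(self.layers, setting):
            layer.kernels = [list(k) for k in kernels]


# Hypothetical setting information: one kernel per layer to keep the example short.
SETTINGS = {
    "P": [[[1] * 9], [[1] * 9], [[1] * 9]],
    "Q": [[[2] * 9], [[2] * 9], [[2] * 9]],
}
APP_TO_SETTING = {"A": "P", "B": "Q"}

network = UpscalingNetwork([ConvLayer(), ConvLayer(), ConvLayer()])
network.apply_setting(SETTINGS[APP_TO_SETTING["A"]])
print(network.layers[0].kernels)
network.apply_setting(SETTINGS[APP_TO_SETTING["B"]])
print(network.layers[0].kernels)
```

Keeping the network structure fixed and swapping only the kernel parameters corresponds to the case where the setting information contains only filter kernel parameters.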
- the AI setting unit 238 may select neural network setting information for AI upscaling of the first image 115 from among a plurality of neural network setting information based on AI data provided from the parsing unit 232 .
- the AI data may include application information necessary for acquiring the first image 115 .
- the application information may include identification information for distinguishing the application from other applications, such as the name of the application.
- the AI setting unit 238 may identify an application corresponding to the first image 115 from the AI data and select the neural network setting information corresponding to the identified application from among the plurality of neural network setting information.
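- For example, when the 'A' application is identified from the AI data, the AI setting unit 238 may select the 'P' neural network setting information shown in FIG. 5.
- Likewise, when the 'B' application is identified from the AI data, the AI setting unit 238 may select the 'Q' neural network setting information shown in FIG. 5.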
- the AI setting unit 238 may identify an application executed in the electronic device 200 to acquire the first image 115 and select neural network setting information corresponding to the identified application. Since the AI setting unit 238 directly identifies an application executed in the electronic device 200, it is possible to select neural network setting information suitable for the first image 115 even when AI data is not included in the AI-encoded data.
- In some cases, it is necessary to predict, based on information related to applications, through which application the first image 115 was provided. This will be described with reference to FIG. 6.
- FIG. 6 is an exemplary diagram illustrating information related to various applications for selection of neural network setting information.
- the electronic device 200 may store information related to various applications required to obtain an image from a provider.
- application-related information may include at least one of information about a provider that operates the application, address information to be accessed through a web browser to obtain an image from the provider corresponding to the application, or information related to an image obtainable from the application.
- the AI setting unit 238 may identify the provider of the first image 115 from the AI data and select, from among the plurality of neural network setting information, the neural network setting information corresponding to the identified provider.
- When the first image 115 is obtained through a web browser, the AI setting unit 238 may select neural network setting information by comparing the address accessed by the web browser with the address information included in the information related to the plurality of applications.
- the AI setting unit 238 may select the 'P' neural network setting information shown in FIG. 5 for AI upscaling of the first image 115.
- the AI setting unit 238 may select the 'R' neural network setting information shown in FIG. 5 for AI upscaling of the first image 115.
- neural network setting information may be selected through application information, provider information, and/or an address accessed by a web browser.
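The three selection routes (application name, provider, or web address) can be sketched as a single lookup. The table contents, field names, host names, and the priority order among the three criteria are hypothetical stand-ins for the application-related information of FIG. 6, which is not reproduced here.

```python
from urllib.parse import urlparse

# Hypothetical stand-in for the application-related information of FIG. 6.
APP_INFO = [
    {"app": "A", "provider": "provider-a", "host": "video.a.example", "setting": "P"},
    {"app": "B", "provider": "provider-b", "host": "video.b.example", "setting": "Q"},
    {"app": "C", "provider": "provider-c", "host": "video.c.example", "setting": "R"},
]


def select_setting(app_name=None, provider=None, url=None):
    """Return the setting name matched by app name, then provider, then web address."""
    host = urlparse(url).hostname if url else None
    criteria = (("app", app_name), ("provider", provider), ("host", host))
    for key, value in criteria:
        if value is None:
            continue
        for entry in APP_INFO:
            if entry[key] == value:
                return entry["setting"]
    return None  # no match: the caller may fall back to image-based prediction


print(select_setting(app_name="A"))
print(select_setting(provider="provider-b"))
print(select_setting(url="https://video.c.example/watch?id=1"))
```

Returning `None` on a miss leaves room for the image-characteristic comparison described next.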
- the AI setting unit 238 may select neural network setting information for AI upscaling of the first image 115 through information related to the first image 115 .
- a method of selecting neural network setting information through information related to the first image 115 may be useful in the following cases.
- For example, after the first image 115 is received through the network and stored, the first image 115 may be played back with a general-purpose video playback program in response to a request or instruction from a user.
- In this case, by comparing the information related to the first image 115 with information about images provided from applications, it is predicted which application's images the first image 115 is similar to.
- the image-related information may include at least one of the file name of the image, whether the image is a still image or a moving image, the resolution of the image, the bit rate of the image data corresponding to the image, the codec type used for encoding the image, the quantization parameter used for encoding the image, the sharpness of the image, the artifact characteristics of the image, or the type of artifact in the image (e.g., ringing artifact, blurring artifact, or block artifact).
- Encoding-related information provided to the AI setting unit 238 from the above-described decoder 234 may be used as image-related information.
- the characteristics of images and image data provided from a plurality of applications may be analyzed in advance and stored in the electronic device 200 .
- the AI setting unit 238 may compare information related to images provided from a plurality of applications with information related to the first image 115 to determine which application's images the first image 115 is similar to, and thereby predict from which application the first image 115 is provided.
- For example, if the resolution of the first image 115 is HD, the bit rate of the image data corresponding to the first image 115 is 10 Mbps, and the first image 115 is encoded with HEVC, the AI setting unit 238 determines that the first image 115 is similar to an image obtained through the 'C' application, and may select the 'R' neural network setting information from among the plurality of neural network setting information shown in FIG. 5.
- the AI setting unit 238 may determine that the first image 115 has been obtained through the 'A' application and select the 'P' neural network setting information from among the plurality of neural network setting information shown in FIG. 5.
- the AI setting unit 238 may determine that the first image 115 has been obtained through the 'B' application and select the 'Q' neural network setting information from among the plurality of neural network setting information shown in FIG. 5.
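One simple way to compare the characteristics of the first image 115 against pre-analyzed application profiles is to count matching attributes. The sketch below does this under explicit assumptions: only the 'C' profile (HD, 10 Mbps, HEVC) comes from the example in the text, while the 'A' and 'B' profiles, the field names, and the match-count scoring rule are hypothetical.

```python
PROFILES = {
    # Hypothetical profile values for 'A' and 'B'; they are not given in the text.
    "A": {"resolution": "FHD", "bitrate_mbps": 20, "codec": "H.264", "setting": "P"},
    "B": {"resolution": "FHD", "bitrate_mbps": 5, "codec": "VP9", "setting": "Q"},
    # Profile taken from the example in the text.
    "C": {"resolution": "HD", "bitrate_mbps": 10, "codec": "HEVC", "setting": "R"},
}
COMPARED_FIELDS = ("resolution", "bitrate_mbps", "codec")


def predict_application(image_info):
    """Pick the application whose stored profile shares the most attributes."""
    best_app, best_score = None, 0
    for app, profile in PROFILES.items():
        score = sum(1 for f in COMPARED_FIELDS if image_info.get(f) == profile[f])
        if score > best_score:
            best_app, best_score = app, score
    if best_app is None:
        return None
    return best_app, PROFILES[best_app]["setting"]


first_image_info = {"resolution": "HD", "bitrate_mbps": 10, "codec": "HEVC"}
print(predict_application(first_image_info))
```

A practical system would likely weight attributes differently or use tolerance bands for bit rate; exact equality is used here only to keep the example short.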
- When receiving and storing the first image 115 through an application, the electronic device 200 may also store information about the application executed to obtain the first image 115. When the first image 115 needs to be reproduced, the electronic device 200 may check the application information stored when the first image 115 was received and select the neural network setting information corresponding to the application of the first image 115.
- an image is composed of two images, an image of a luminance component and an image of a chroma component.
- the size of a luminance component image is 4 times larger than that of a chroma component image.
- When an image is made up of an image of a first color component (e.g., luminance), an image of a second color component (e.g., chroma (Cb)), and an image of a third color component (e.g., chroma (Cr)), and their sizes are not the same, a method for efficiently AI upscaling the corresponding image is required.
- Hereinafter, a case is described in which an image includes an image of a first color component, an image of a second color component, and an image of a third color component, and the size of the image of the first color component is larger than the sizes of the images of the second color component and the third color component.
- FIG. 7 is an exemplary diagram illustrating an upscaling neural network 700 for AI upscaling of the first image 115 .
- the upscaling neural network 700 may include a first scaler 710, a first convolution layer 720, a second convolution layer 730, a third convolution layer 740, a depth to space layer 750, a second scaler 760, and a clipping layer 770.
- the first image 115 includes a first sub-image 702 of a first color component, a second sub-image 704 of a second color component, and a third sub-image 706 of a third color component.
- the first color component may include a luminance component
- the second color component may include a chroma (Cb) component
- the third color component may include a chroma (Cr) component.
- the size of the first sub-image 702 is four times larger than that of the second sub-image 704 and the third sub-image 706 .
- the second sub-image 704 and the third sub-image 706 are input to the first scaler 710 .
- the first scaler 710 increases the size of the second sub-image 704 and the third sub-image 706 .
- the first scaler 710 may scale the size of the second sub-image 704 and the third sub-image 706 to be the same as that of the first sub-image 702 .
- the reason for scaling the size of the second sub-image 704 and the third sub-image 706 to be the same as that of the first sub-image 702 is so that the results of convolution operations between a filter kernel of a specific size and the images of each channel can be summed.
- one filter kernel is convolved with each of the input images, and the results of the convolution operation with each image are summed together to obtain one feature map corresponding to the filter kernel. If the sizes of the images to be subjected to the convolution operation are not the same, the sizes of the convolution operation results are also not the same, and accordingly, summing up the convolution operation results becomes difficult.
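The size constraint can be seen in a small sketch that convolves each channel and sums the results. Several points are assumptions for illustration: "four times larger" is read as four times the sample count (twice the width and height, as in 4:2:0 sampling), nearest-neighbour replication is used as a stand-in scaler rather than one of the methods named in the text, and all pixel and kernel values are placeholders.

```python
def replicate_upscale(plane, factor):
    """Nearest-neighbour enlargement by `factor` in each dimension (stand-in scaler)."""
    out = []
    for row in plane:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def conv_valid(plane, kernel):
    """Convolution without padding, stride 1."""
    k = len(kernel)
    h, w = len(plane), len(plane[0])
    return [[sum(plane[r + i][c + j] * kernel[i][j] for i in range(k) for j in range(k))
             for c in range(w - k + 1)] for r in range(h - k + 1)]


def multi_channel_feature_map(planes, kernels):
    """Convolve each channel with its kernel slice and sum the per-channel results."""
    results = [conv_valid(p, k) for p, k in zip(planes, kernels)]
    shapes = {(len(r), len(r[0])) for r in results}
    if len(shapes) != 1:
        raise ValueError("channel sizes differ, so the results cannot be summed")
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*results)]


ones = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
luma = [[1] * 8 for _ in range(8)]   # placeholder first sub-image (8 x 8)
cb = [[2] * 4 for _ in range(4)]     # placeholder second sub-image (4 x 4)
cr = [[3] * 4 for _ in range(4)]     # placeholder third sub-image (4 x 4)

scaled = [luma, replicate_upscale(cb, 2), replicate_upscale(cr, 2)]
feature_map = multi_channel_feature_map(scaled, [ones, ones, ones])
print(len(feature_map), len(feature_map[0]), feature_map[0][0])
```

Passing the unscaled chroma planes to `multi_channel_feature_map` raises the `ValueError`, which is the situation the first scaler 710 is there to avoid.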
- the first scaler 710 may increase the size of the second sub-image 704 and the third sub-image 706 on a legacy basis rather than an AI basis.
- the legacy-based scaling may include at least one of a bilinear scale, a bicubic scale, a lanczos scale, or a stair step scale.
- the first scaler 710 may be implemented as a convolution layer.
- the first sub-image 702 , the scaled second sub-image 704 , and the scaled third sub-image 706 are input to the first convolution layer 720 .
- 3X3X32 displayed on the first convolution layer 720 shown in FIG. 7 exemplifies that convolution is performed on images of three channels using 32 filter kernels having a size of 3 x 3.
- 32 feature maps are generated by 32 filter kernels.
- Feature maps output from the first convolution layer 720 are input to the second convolution layer 730 .
- 3X3X32 displayed on the second convolution layer 730 exemplifies convolution processing on feature maps output from the previous layer 720 using 32 filter kernels having a size of 3 x 3.
- 32 feature maps are generated by 32 filter kernels.
- the feature maps output from the second convolution layer 730 are input to the third convolution layer 740 .
- 3X3X(6n) (n is a natural number) displayed on the third convolution layer 740 exemplifies convolution processing on the feature maps output from the previous layer 730 using 6n filter kernels having a size of 3 x 3. As a result of the convolution process, 6n feature maps are generated by 6n filter kernels.
- Feature maps output from the third convolution layer 740 are input to the depth space transformation layer 750 .
- the depth space transformation layer 750 reduces the number of feature maps output from the previous layer 740 . Specifically, the depth space transform layer 750 may reduce the number of channels by arranging samples in a channel direction on a spatial domain of one channel.
- FIG. 8 is a diagram for explaining the operation of the depth space conversion layer 750 shown in FIG. 7 .
- FIG. 8 illustrates a situation in which a first feature map 810, a second feature map 820, a third feature map 830, and a fourth feature map 840 corresponding to four channels are processed by the depth space conversion layer 750 and one fifth feature map 850 is output.
- the number of channels input to the depth space transform layer 750 and the number of channels output from the depth space transform layer 750 may be variously changed.
- the depth space conversion layer 750 may generate the fifth feature map 850 by arranging the samples 815, 825, 835, and 845 included in the first feature map 810, the second feature map 820, the third feature map 830, and the fourth feature map 840 in one feature map.
- a method of arranging samples of input feature maps may vary. FIG. 8 shows the samples 815, 825, 835, and 845 located at the same position in the first feature map 810, the second feature map 820, the third feature map 830, and the fourth feature map 840 as being arranged adjacent to each other within the fifth feature map 850, but another sample may be positioned between the samples 815, 825, 835, and 845 located at the same position in those feature maps.
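The rearrangement shown in FIG. 8 can be sketched as follows for the adjacent-placement case. The function name, the row-major placement order inside each block, and the sample values are assumptions; as noted above, other arrangements are possible.

```python
def depth_to_space(feature_maps, block):
    """Merge block*block feature maps into one map that is `block` times wider and taller."""
    if len(feature_maps) != block * block:
        raise ValueError("expected block*block input feature maps")
    h, w = len(feature_maps[0]), len(feature_maps[0][0])
    out = [[None] * (w * block) for _ in range(h * block)]
    for idx, fmap in enumerate(feature_maps):
        dr, dc = divmod(idx, block)  # position of this channel inside each block
        for r in range(h):
            for c in range(w):
                out[r * block + dr][c * block + dc] = fmap[r][c]
    return out


# Placeholder 2 x 2 feature maps standing in for 810, 820, 830, and 840.
fm_810 = [[1, 2], [3, 4]]
fm_820 = [[10, 20], [30, 40]]
fm_830 = [[100, 200], [300, 400]]
fm_840 = [[1000, 2000], [3000, 4000]]

fm_850 = depth_to_space([fm_810, fm_820, fm_830, fm_840], block=2)
for row in fm_850:
    print(row)
```

Four input channels of 2 x 2 samples become one 4 x 4 channel, so the channel count drops while the spatial size grows.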
- the depth space conversion layer 750 generates a feature map 752 of a first color component by combining samples of 4n feature maps among the 6n input feature maps. Then, the depth space conversion layer 750 generates a feature map 754 of a second color component by combining samples of n feature maps among the remaining feature maps, and generates a feature map 756 of a third color component by combining samples of the remaining n feature maps.
- As described above, the size of the first sub-image 702 is 4 times larger than that of the second sub-image 704 and the third sub-image 706. To maintain this size relationship, the feature map 752 of the first color component is generated using 4n feature maps, and the feature map 754 of the second color component and the feature map 756 of the third color component are each generated using n feature maps. Accordingly, the size of the feature map 752 of the first color component may be 4 times as large as that of the feature map 754 of the second color component and the feature map 756 of the third color component.
- the first sub-image 702 , the second sub-image 704, and the third sub-image 706 may be input to the second scaler 760 .
- the second scaler 760 scales the sizes of the first sub-image 702, the second sub-image 704, and the third sub-image 706 according to the scaling factor of the neural network 700 for upscaling.
- the scaling factor of the neural network 700 for upscaling means a ratio between the size of the first image 115 and the size of the second image 125 .
- the second scaler 760 may increase the size of the first sub-image 702, the second sub-image 704, and the third sub-image 706 based on a legacy basis rather than an AI basis.
- the legacy-based scaling may include at least one of a bilinear scale, a bicubic scale, a lanczos scale, or a stair step scale.
- the second scaler 760 may be implemented as a convolution layer.
- the first sub-image 702, the second sub-image 704, and the third sub-image 706 scaled by the second scaler 760 are added to the feature map 752 of the first color component, the feature map 754 of the second color component, and the feature map 756 of the third color component, respectively.
- images and feature maps of the same color component may be added.
- a prediction version of the second image 125 is obtained through the second scaler 760 of the skip connection structure, and a residual version of the second image 125 is obtained through the first scaler 710 to the depth space transform layer 750.
- the second image 125 may be obtained by adding the predicted version and the residual version of the second image 125 .
- the clipping layer 770 prevents overshooting of a summation result of images output from the second scaler 760 and feature maps output from the depth space transformation layer 750 .
- the clipping layer 770 may clip sample values resulting from summing the images output from the second scaler 760 and the feature maps output from the depth space conversion layer 750 according to Equation 1 below.
- [Equation 1] $\text{second image} = \operatorname{clip}(res + skip,\ skip \times ratio\_lowerbound,\ skip \times ratio\_upperbound)$
- In Equation 1, res denotes a residual version of the second image 125, that is, samples of feature maps output from the depth space transform layer 750, and skip denotes a predicted version of the second image 125, that is, samples of images output from the second scaler 760.
- ratio_lowerbound and ratio_upperbound are predetermined values, and are values for limiting sample values of the second image 125 within a certain range.
- the sum of the first sub-image 702 scaled by the second scaler 760 (i.e., skip) and the feature map 752 of the first color component output by the depth space transform layer 750 (i.e., res) is clipped within the range between the values obtained by multiplying the first sub-image 702 scaled by the second scaler 760 (i.e., skip) by ratio_lowerbound and by ratio_upperbound.
- the sum of the second sub-image 704 scaled by the second scaler 760 (i.e., skip) and the feature map 754 of the second color component output by the depth space transform layer 750 (i.e., res) is clipped within the range between the values obtained by multiplying the second sub-image 704 scaled by the second scaler 760 (i.e., skip) by ratio_lowerbound and by ratio_upperbound.
- the sum of the third sub-image 706 scaled by the second scaler 760 (i.e., skip) and the feature map 756 of the third color component output by the depth space transform layer 750 (i.e., res) is clipped within the range between the values obtained by multiplying the third sub-image 706 scaled by the second scaler 760 (i.e., skip) by ratio_lowerbound and by ratio_upperbound.
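The clipping described for the three color components can be sketched per sample as below. The function names and the bound values 0.5 and 1.5 are assumptions chosen for illustration, and the sketch assumes non-negative skip samples so that the lower limit never exceeds the upper limit.

```python
def clip(value, low, high):
    """Limit `value` to the closed range [low, high]."""
    return max(low, min(value, high))


def clip_output(res, skip, ratio_lowerbound, ratio_upperbound):
    """Add residual and prediction, then bound each sample relative to the prediction."""
    out = []
    for res_row, skip_row in zip(res, skip):
        out.append([
            clip(r + s, s * ratio_lowerbound, s * ratio_upperbound)
            for r, s in zip(res_row, skip_row)
        ])
    return out


# Placeholder samples: skip is the legacy-scaled image, res is the network residual.
skip = [[100, 100, 100]]
res = [[5, 80, -80]]

# Hypothetical bounds: the output may deviate at most 50% from the prediction.
second_image = clip_output(res, skip, ratio_lowerbound=0.5, ratio_upperbound=1.5)
print(second_image)
```

The first sample passes through unchanged, while the second and third are pulled back to the upper and lower limits, which is how overshooting of the summation result is prevented.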
- As a result of the clipping, a second image 125 including a fourth sub-image 782 of a first color component, a fifth sub-image 784 of a second color component, and a sixth sub-image 786 of a third color component may be obtained.
- the second scaler 760 and/or the clipping layer 770 may be omitted from the upscaling neural network 700, and the number of convolution layers may be variously changed.
- one or more activation layers described in relation to FIG. 3 may be included in the neural network 700 for upscaling.
- the size and number of filter kernels shown in FIG. 7 are only examples and may be changed in various ways according to implementation examples.
- the upscaling neural network 700 shown in FIG. 7 is for a case where the size of the first sub-image 702 is 4 times larger than the size of the second sub-image 704 and the third sub-image 706. If the size of the first sub-image 702 is k times larger than the size of the second sub-image 704 and the third sub-image 706, the scaling factor of the first scaler 710 becomes k. In addition, if a feature maps are used to generate the feature map 752 of the first color component in the depth space conversion layer 750, a/k feature maps may be used to generate each of the feature map 754 of the second color component and the feature map 756 of the third color component. To this end, a+(a/k)+(a/k) feature maps may be output from the previous layer 740.
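The feature-map count can be checked with a few lines of arithmetic. Taking a feature maps for the first color component and a/k for each of the other two, the previous layer needs a + a/k + a/k outputs, which reduces to the 6n of FIG. 7 when k = 4 and a = 4n. The function name and the requirement that a be divisible by k are assumptions of this sketch.

```python
def feature_map_budget(a, k):
    """Split the output channels of the last convolution layer among three color components."""
    if a % k != 0:
        raise ValueError("a must be divisible by k in this sketch")
    per_chroma = a // k
    return {
        "first": a,
        "second": per_chroma,
        "third": per_chroma,
        "total": a + 2 * per_chroma,
    }


n = 2
budget = feature_map_budget(a=4 * n, k=4)
print(budget)  # total equals 6 * n for the FIG. 7 configuration
```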
- FIG. 9 is an exemplary diagram illustrating an upscaling neural network 900 according to another embodiment for AI upscaling of the first image 115 .
- the upscaling neural network 900 includes a first convolution layer 910, a second convolution layer 920, a third convolution layer 930, a depth space conversion layer 940, a scaler 950, and a clipping layer 960.
- the first image 115 includes a first sub-image 702 of a first color component, a second sub-image 704 of a second color component, and a third sub-image 706 of a third color component.
- the first color component may include a luminance component
- the second color component may include a chroma (Cb) component
- the third color component may include a chroma (Cr) component.
- the size of the first sub-image 702 is four times larger than that of the second sub-image 704 and the third sub-image 706 . Due to this size difference, only the first sub-image 702 among the first sub-image 702 , the second sub-image 704 , and the third sub-image 706 is input to the first convolution layer 910 .
- 3X3X32 displayed on the first convolution layer 910 shown in FIG. 9 exemplifies that convolution processing is performed on images of one channel using 32 filter kernels having a size of 3 x 3.
- 32 feature maps are generated by 32 filter kernels.
- Feature maps output from the first convolution layer 910 are input to the second convolution layer 920 .
- 3X3X32 displayed on the second convolution layer 920 exemplifies convolution processing on feature maps output from the previous layer using 32 filter kernels having a size of 3 x 3.
- 32 feature maps are generated by 32 filter kernels.
- Feature maps output from the second convolution layer 920 are input to the third convolution layer 930 .
- 3X3X16 displayed on the third convolution layer 930 exemplifies convolution processing on feature maps output from the previous layer 920 using 16 filter kernels having a size of 3 x 3.
- 16 feature maps are generated by 16 filter kernels.
- Feature maps output from the third convolution layer 930 are input to the depth space transformation layer 940 .
- the depth space conversion layer 940 generates a feature map 942 of a first color component by combining samples of feature maps output from the third convolution layer 930 .
- Separately from the first sub-image 702 being input to the first convolution layer 910, the first sub-image 702, the second sub-image 704, and the third sub-image 706 may be input to the scaler 950.
- the scaler 950 scales the sizes of the first sub-image 702, the second sub-image 704, and the third sub-image 706 according to the scaling factor of the neural network 900 for upscaling.
- The scaler 950 may increase the size of the first sub-image 702, the second sub-image 704, and the third sub-image 706 using legacy (non-AI) scaling rather than AI-based scaling.
- The legacy scaling may include at least one of a bilinear scale, a bicubic scale, a Lanczos scale, or a stair step scale.
- the scaler 950 may be implemented as a convolution layer.
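To make the legacy scaling concrete, here is a minimal bilinear upscaling sketch for one sub-image and an integer scale factor. The sample-centre mapping and the edge clamping are assumptions chosen for illustration; the source does not specify how the scaler 950 handles borders.

```python
from typing import List


def bilinear_upscale(img: List[List[float]], scale: int) -> List[List[float]]:
    """Upscale a 2-D image by an integer factor with bilinear interpolation."""
    h, w = len(img), len(img[0])
    out = []
    for oy in range(h * scale):
        # Map the output sample centre back to source coordinates, clamped to the image.
        sy = min(max((oy + 0.5) / scale - 0.5, 0.0), h - 1.0)
        y0 = int(sy)
        y1 = min(y0 + 1, h - 1)
        fy = sy - y0
        row = []
        for ox in range(w * scale):
            sx = min(max((ox + 0.5) / scale - 0.5, 0.0), w - 1.0)
            x0 = int(sx)
            x1 = min(x0 + 1, w - 1)
            fx = sx - x0
            # Blend horizontally on the two neighbouring rows, then vertically.
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


up = bilinear_upscale([[0.0, 100.0]], 2)
print(up)
```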
- The first sub-image 702 scaled by the scaler 950 is added to the feature map 942 of the first color component. Since there are no feature maps of the same color components for the second sub-image 704 and the third sub-image 706 scaled by the scaler 950, the scaled second sub-image 704 and the scaled third sub-image 706 are transferred to the clipping layer 960 as they are, together with the sum of the scaled first sub-image 702 and the feature map 942 of the first color component.
- A prediction version of the fourth sub-image 972 of the first color component is obtained through the scaler 950 of the skip connection structure, and a residual version of the fourth sub-image 972 is obtained through the first convolution layer 910 to the depth-to-space layer 940.
- The fourth sub-image 972 constituting the second image 125 may be obtained by adding the prediction version of the fourth sub-image 972 and the residual version of the fourth sub-image 972.
- The clipping layer 960 prevents overshooting of the result of summing the images output from the scaler 950 and the feature map output from the depth-to-space layer 940.
- The clipping layer 960 may clip the sample values obtained by summing the images output from the scaler 950 and the feature map output from the depth-to-space layer 940 according to Equation 1 above.
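The prediction-plus-residual summation followed by clipping can be sketched as below. Equation 1 is not reproduced in this part of the text, so the clipping range used here, 0 to 2^bit_depth − 1, is an assumption, as are the function and parameter names.

```python
from typing import List

Image = List[List[float]]


def add_and_clip(prediction: Image, residual: Image, bit_depth: int = 8) -> Image:
    """Sum the scaled image (prediction) and the feature map (residual), then clip.

    The range [0, 2**bit_depth - 1] is an assumed stand-in for Equation 1.
    """
    lo, hi = 0, (1 << bit_depth) - 1
    return [
        [min(max(p + r, lo), hi) for p, r in zip(p_row, r_row)]
        for p_row, r_row in zip(prediction, residual)
    ]


prediction = [[250.0, 10.0], [128.0, 0.0]]   # output of the scaler (skip connection)
residual = [[12.0, -3.0], [0.5, -4.0]]       # output of the depth-to-space layer
fourth = add_and_clip(prediction, residual)
print(fourth)
```

The first and last samples show the overshoot case: 262 is limited to 255 and −4 is limited to 0, while in-range sums pass through unchanged.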
- The second image 125, which includes a fourth sub-image 972 of the first color component, a fifth sub-image 974 of the second color component, and a sixth sub-image 976 of the third color component, may thus be obtained.
- the scaler 950 and/or the clipping layer 960 may be omitted from the upscaling neural network 900, and the number of convolution layers may be variously changed. Also, although not shown in FIG. 9 , one or more activation layers described in relation to FIG. 3 may be included in the upscaling neural network 900 . In addition, the size and number of filter kernels shown in FIG. 9 are only examples and may be changed in various ways according to implementation examples.
- Because the neural network 900 for upscaling shown in FIG. 9 performs convolution only on the first sub-image 702, the amount of computation can be significantly reduced compared to the neural network 700 for upscaling shown in FIG. 7.
- FIG. 10 is a diagram for explaining a method of training the neural network 1000 for upscaling.
- the upscaling neural network 1000 may have the structure of the upscaling neural network 300, 700, or 900 described above. Depending on implementation, the neural network 1000 for upscaling may have various structures including one or more convolutional layers.
- AI upscaling of the first image 115 is performed with neural network setting information corresponding to an application used to acquire the first image 115 . Since the AI upscaling of the first image 115 depends on the application used to acquire the first image 115, acquisition of neural network setting information specialized for each application is required.
- The first training image 1010 means an image obtained from an application.
- The second training image 1030 means an image AI-upscaled from the first training image 1010 through the upscaling neural network 1000.
- the first training image 1010 may be generated by processing an original training image 1050 by an application or a provider operating the application.
- The original training image 1050 is processed (e.g., resolution conversion and encoding) by an application or a provider operating the application to generate the first training image 1010, and the first training image 1010 may be provided to the electronic device 200 or the training device 1100 described later.
- The electronic device 200 or the training device 1100 may upload the original training image 1050 to a provider (or a server operated by the provider) and acquire, through an application, the first training image 1010 processed by the provider.
- the upscaling neural network 1000 is trained based on loss information 1070 corresponding to a comparison result between the second training image 1030 and the original training image 1050.
- the upscaling neural network 1000 generates a second training image 1030 by AI upscaling the first training image 1010 according to preset neural network setting information.
- the neural network setting information of the upscaling neural network 1000 is updated according to the loss information 1070 corresponding to the comparison result between the second training image 1030 and the original training image 1050.
- The upscaling neural network 1000 may update the neural network setting information, e.g., parameters, so that the loss information 1070 is reduced or minimized.
- The loss information 1070 corresponding to the comparison result between the second training image 1030 and the original training image 1050 may include at least one of an L1-norm value for the difference between the second training image 1030 and the original training image 1050, an L2-norm value, an SSIM (Structural Similarity) value, a PSNR-HVS (Peak Signal-To-Noise Ratio-Human Vision System) value, an MS-SSIM (Multiscale SSIM) value, a VIF (Variance Inflation Factor) value, or a VMAF (Video Multimethod Assessment Fusion) value.
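Two of the listed measures, the L1-norm and the L2-norm of the difference between the two images, are simple enough to sketch directly. The helper names below are illustrative, and the images are represented as nested lists of sample values purely for the example.

```python
import math
from typing import List

Image = List[List[float]]


def flatten_diff(a: Image, b: Image) -> List[float]:
    """Sample-wise difference between two images of the same size."""
    return [x - y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]


def l1_norm(a: Image, b: Image) -> float:
    """Sum of absolute differences."""
    return sum(abs(d) for d in flatten_diff(a, b))


def l2_norm(a: Image, b: Image) -> float:
    """Square root of the sum of squared differences."""
    return math.sqrt(sum(d * d for d in flatten_diff(a, b)))


second_training = [[10.0, 20.0], [30.0, 40.0]]
original_training = [[13.0, 16.0], [30.0, 40.0]]
l1 = l1_norm(second_training, original_training)
l2 = l2_norm(second_training, original_training)
print(l1, l2)
```

In practice a training setup often normalizes these values by the number of samples; that normalization is a design choice and is not stated in the source.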
- first training images 1010 provided from a plurality of applications may be used for training.
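- Neural network setting information corresponding to the first application may be obtained as a result of training the neural network 1000 for upscaling based on the first training image 1010 obtained through the first application.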
- neural network setting information corresponding to the second application may be obtained as a result of training the neural network 1000 for upscaling based on the first training image 1010 obtained through the second application. That is, as the upscaling neural network 1000 is individually trained from the first training images 1010 provided through each application, a plurality of neural network setting information respectively corresponding to a plurality of applications may be obtained.
- FIG. 11 is a diagram for explaining a training process of the upscaling neural network 1000 by the training apparatus 1100.
- Training of the upscaling neural network 1000 described with reference to FIG. 10 may be performed by the training apparatus 1100.
- the training device 1100 may be, for example, the electronic device 200 or a separate server.
- Neural network setting information of the upscaling neural network 1000 obtained as a result of training is stored in the electronic device 200 .
- the training device 1100 initially sets neural network setting information of the upscaling neural network 1000 (S1110). Accordingly, the upscaling neural network 1000 may operate according to predetermined neural network setting information.
- The neural network setting information may include information on at least one of the number of convolution layers included in the upscaling neural network 1000, the number of filter kernels for each convolution layer, the size of the filter kernels for each convolution layer, or the parameters of each filter kernel.
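One way to picture the contents of the neural network setting information is as a small record per convolution layer. The sketch below is only an illustration; the class and field names are hypothetical, and the example values reuse the 32, 32, and 16 filter kernels of size 3 from the network of FIG. 9.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ConvLayerSetting:
    """Hypothetical per-layer entry of the setting information."""
    num_kernels: int                 # number of filter kernels in the layer
    kernel_size: int                 # side length of each square filter kernel
    parameters: List[float] = field(default_factory=list)  # filter kernel weights


@dataclass
class NeuralNetworkSetting:
    """Hypothetical container for one application's setting information."""
    layers: List[ConvLayerSetting]

    @property
    def num_conv_layers(self) -> int:
        return len(self.layers)


setting = NeuralNetworkSetting(
    layers=[ConvLayerSetting(32, 3), ConvLayerSetting(32, 3), ConvLayerSetting(16, 3)]
)
print(setting.num_conv_layers)
```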
- the training device 1100 inputs the first training image 1010 acquired through the application to the upscaling neural network 1000 (S1120).
- The upscaling neural network 1000 processes the first training image 1010 according to the initially set neural network setting information, and outputs the second training image 1030 AI-upscaled from the first training image 1010 (S1130).
- the training apparatus 1100 calculates loss information 1070 based on the second training image 1030 and the original training image 1050 (S1140).
- The training apparatus 1100 provides the loss information 1070 to the upscaling neural network 1000, and the upscaling neural network 1000 updates the initially set neural network setting information through a back-propagation process based on the loss information 1070 (S1150).
- the training apparatus 1100 and the neural network for upscaling 1000 update the neural network setting information while repeating processes S1120 to S1150 until the loss information 1070 is minimized.
- During each iteration, the upscaling neural network 1000 operates according to the neural network setting information updated in the previous iteration.
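The loop over S1120 to S1150 can be illustrated with a deliberately tiny stand-in model. The sketch below is not the disclosed network: it replaces the convolutional network with sample repetition followed by a single learned gain, uses a mean squared error as the loss information, computes the gradient analytically instead of by general back-propagation, and stops after a fixed number of steps rather than testing for a minimum. All names, the learning rate, and the step count are assumptions.

```python
from typing import List, Tuple

Image = List[List[float]]


def upscale(image: Image, gain: float, scale: int = 2) -> Image:
    """Toy stand-in for the upscaling network: repeat each sample, apply one gain."""
    out = []
    for row in image:
        wide = [gain * v for v in row for _ in range(scale)]
        out.extend([list(wide) for _ in range(scale)])
    return out


def train(first: Image, original: Image, gain: float = 0.5,
          lr: float = 0.05, steps: int = 200) -> Tuple[float, float]:
    n = sum(len(row) for row in original)
    base = upscale(first, 1.0)            # derivative of the output w.r.t. the gain
    loss = float("inf")
    for _ in range(steps):                # repeat S1120 to S1150
        second = upscale(first, gain)     # S1120-S1130: produce the second training image
        pairs = [(s, o, b) for rs, ro, rb in zip(second, original, base)
                 for s, o, b in zip(rs, ro, rb)]
        loss = sum((s - o) ** 2 for s, o, _ in pairs) / n        # S1140: loss information
        grad = sum(2 * (s - o) * b for s, o, b in pairs) / n     # gradient of the loss
        gain -= lr * grad                 # S1150: update the setting information
    return gain, loss


first = [[1.0, 2.0]]
original = upscale(first, 2.0)            # pretend the original is twice as bright
gain, loss = train(first, original)
print(gain, loss)
```

Running the same loop once per application, each time with first training images obtained through that application, would yield one set of learned parameters per application.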
- FIG. 12 is a flowchart illustrating an AI decoding method by the electronic device 200 according to an embodiment.
- In step S1210, the electronic device 200 acquires the first image 115.
- the electronic device 200 may acquire image data of the first image 115 and decode the image data to obtain the first image 115 .
- The electronic device 200 may obtain the first image 115 through a network, or may obtain the first image 115 from a storage medium of the electronic device 200 in which the first image 115 is stored.
- In step S1220, the electronic device 200 determines whether AI upscaling of the first image 115 is required.
- the electronic device 200 may determine whether AI upscaling is necessary based on the resolution of the first image 115 .
- For example, if the resolution of the first image 115 is not greater than a first predetermined resolution, the electronic device 200 may determine that AI upscaling is not necessary, and if the resolution of the first image 115 is greater than the first predetermined resolution, it may determine that AI upscaling is necessary.
- As another example, if the resolution of the first image 115 is not smaller than a second predetermined resolution, the electronic device 200 may determine that AI upscaling is not required, and if the resolution of the first image 115 is smaller than the second predetermined resolution, it may determine that AI upscaling is necessary.
- As another example, if the resolution of the first image 115 is within a range between the first resolution and the second resolution, the electronic device 200 may determine that AI upscaling is necessary, and if the resolution of the first image 115 is out of the range between the first resolution and the second resolution, it may determine that AI upscaling is not required.
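The three example criteria for step S1220 can be written as small predicates. This is a sketch under assumptions: the resolution is reduced to a single number (for instance the number of lines), the behaviour exactly at a threshold is chosen arbitrarily, and the function names are hypothetical.

```python
def required_if_above(resolution: int, first_resolution: int) -> bool:
    """First example: upscaling is required above the first predetermined resolution."""
    return resolution > first_resolution


def required_if_below(resolution: int, second_resolution: int) -> bool:
    """Second example: upscaling is required below the second predetermined resolution."""
    return resolution < second_resolution


def required_if_between(resolution: int, first_resolution: int,
                        second_resolution: int) -> bool:
    """Third example: upscaling is required only inside the range between the two."""
    low, high = sorted((first_resolution, second_resolution))
    return low <= resolution <= high


print(required_if_below(720, 2160))          # a 720-line image against a 2160-line threshold
print(required_if_between(480, 720, 2160))   # outside the range
```

Which of the three criteria applies is left open by the text, so a device would pick one of them, not combine all three.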
- In step S1230, the electronic device 200 identifies an application used to obtain the first image 115.
- The electronic device 200 may identify the application corresponding to the first image 115 using at least one of information about the application required (or used) to obtain the first image 115, information about the address accessed by a web browser to obtain the first image 115, information about the provider of the first image 115, or information related to the first image 115.
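Identification from the address accessed by a web browser can be sketched as a comparison of the host name against addresses associated with each application. The application identifiers and host names below are placeholders invented for the example, and matching on the host name (including subdomains) is an assumed policy.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical table of addresses corresponding to each application.
APP_ADDRESSES = {
    "app_a": ["video.app-a.example"],
    "app_b": ["stream.app-b.example"],
}


def identify_application(url: str) -> Optional[str]:
    """Return the application whose known address matches the accessed URL, if any."""
    host = urlparse(url).hostname or ""
    for app, hosts in APP_ADDRESSES.items():
        # Accept an exact host match or any subdomain of a known host.
        if any(host == h or host.endswith("." + h) for h in hosts):
            return app
    return None


print(identify_application("https://cdn.video.app-a.example/clip.mp4"))
print(identify_application("https://unknown.example/clip.mp4"))
```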
- In step S1240, the electronic device 200 selects neural network setting information corresponding to the application identified in step S1230 from among a plurality of pieces of pre-stored neural network setting information.
- In step S1250, the electronic device 200 obtains a second image 125 by AI upscaling the first image 115 with an upscaling neural network that operates according to the selected neural network setting information.
- In step S1260, the electronic device 200 outputs the second image 125.
- the electronic device 200 may output the second image 125 to a display, and the display may reproduce the second image 125 after post-processing it, if necessary.
- In step S1270, when it is determined that AI upscaling is not required, the electronic device 200 outputs the first image 115.
- the electronic device 200 may output the first image 115 to a display, and the display may reproduce the first image 115 after post-processing it, if necessary.
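The flow of FIG. 12 from S1220 to S1270 can be summarized in a short sketch. Several simplifications are assumed: each stored setting is represented by an already configured upscaling function, the resolution check uses the "smaller than a predetermined resolution" criterion on the image height, and the image is returned unchanged when no setting exists for the identified application (a fallback the source does not describe). All names are hypothetical.

```python
from typing import Callable, Dict, List

Image = List[List[float]]
Upscaler = Callable[[Image], Image]


def double_rows_and_columns(image: Image) -> Image:
    """Hypothetical stand-in for a configured upscaling network (2x sample repetition)."""
    return [[v for v in row for _ in range(2)] for row in image for _ in range(2)]


def reproduce(first_image: Image, app_id: str,
              upscalers: Dict[str, Upscaler], min_height: int) -> Image:
    # S1220: treat AI upscaling as required when the image has fewer rows than min_height.
    if len(first_image) >= min_height:
        return first_image                    # S1270: output the first image as is
    upscaler = upscalers.get(app_id)          # S1230-S1240: entry for the identified application
    if upscaler is None:
        return first_image                    # assumed fallback, not described in the source
    return upscaler(first_image)              # S1250-S1260: upscale and output the second image


upscalers = {"app_a": double_rows_and_columns}
small = [[1.0, 2.0]]
second = reproduce(small, "app_a", upscalers, min_height=2)
print(second)
```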
- the above-described embodiments of the present disclosure can be written as programs or instructions that can be executed on a computer, and the written programs or instructions can be stored in a storage medium.
- the device-readable storage medium may be provided in the form of a non-transitory storage medium.
- Here, a 'non-transitory storage medium' only means that it is a tangible device and does not contain signals (e.g., electromagnetic waves); the term does not distinguish between the case where data is stored semi-permanently in the storage medium and the case where it is stored temporarily.
- For example, a 'non-transitory storage medium' may include a buffer in which data is temporarily stored.
- the method according to various embodiments disclosed in this document may be included and provided in a computer program product.
- Computer program products may be traded between sellers and buyers as commodities.
- A computer program product may be distributed in the form of a device-readable storage medium (e.g., compact disc read only memory (CD-ROM)), or may be distributed online (e.g., downloaded or uploaded) through an application store (e.g., Play Store™) or directly between two user devices (e.g., smartphones).
- A part of a computer program product (e.g., a downloadable app) may be temporarily stored in, or temporarily created in, a device-readable storage medium such as a memory of a manufacturer's server, an application store's server, or a relay server.
- the model related to the neural network described above may be implemented as a software module.
- the neural network model may be stored in a computer-readable recording medium.
- the neural network model may be integrated in the form of a hardware chip and become a part of the electronic device 200 described above.
- A neural network model may be built in the form of a dedicated hardware chip for artificial intelligence, or as part of an existing general-purpose processor (e.g., a CPU or application processor) or a graphics-dedicated processor (e.g., a GPU).
- the neural network model may be provided in the form of downloadable software.
- a computer program product may include a product in the form of a software program (eg, a downloadable application) that is distributed electronically by a manufacturer or through an electronic marketplace. For electronic distribution, at least a portion of the software program may be stored on a storage medium or may be temporarily created.
- the storage medium may be a storage medium of a manufacturer or a server of an electronic market or a relay server.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Multimedia (AREA)
- Artificial Intelligence (AREA)
- Software Systems (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Databases & Information Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Molecular Biology (AREA)
- Mathematical Physics (AREA)
- Quality & Reliability (AREA)
- Medical Informatics (AREA)
- Image Analysis (AREA)
Claims (15)
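- An electronic device for reproducing an image, comprising: a display; and a processor, wherein the processor is configured to: obtain a first image; identify an application used to obtain the first image; select, from among a plurality of pieces of neural network setting information, neural network setting information corresponding to the identified application; obtain a second image by AI (Artificial Intelligence) upscaling the first image using an upscaling neural network to which the selected neural network setting information is applied; and output the obtained second image through the display.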
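- The electronic device of claim 1, wherein the processor selects the neural network setting information by comparing information about images provided through a plurality of applications with information related to the first image.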
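- The electronic device of claim 1, wherein the processor identifies an address accessed by a web browser to obtain the first image, and selects the neural network setting information by comparing the identified address with addresses corresponding to a plurality of applications.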
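- The electronic device of claim 1, wherein the upscaling neural network is trained based on a comparison between an original training image and a second training image that is AI-upscaled by the upscaling neural network from a first training image provided through a plurality of applications.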
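- The electronic device of claim 4, wherein neural network setting information corresponding to a first application among the plurality of applications is obtained through training of the upscaling neural network based on a first training image provided by the first application, and neural network setting information corresponding to a second application among the plurality of applications is obtained through training of the upscaling neural network based on a first training image provided by the second application.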
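- The electronic device of claim 1, wherein the processor determines whether to perform the AI upscaling based on a comparison between the resolution of the obtained first image and a predetermined resolution; based on a determination not to perform the AI upscaling, outputs the first image through the display; and based on a determination to perform the AI upscaling, outputs the second image through the display.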
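- The electronic device of claim 1, wherein the first image includes a first sub-image of a first color component, a second sub-image of a second color component, and a third sub-image of a third color component; the size of the first sub-image is larger than the size of the second sub-image and the size of the third sub-image; the upscaling neural network includes: a first scaling layer that scales the size of the second sub-image and the size of the third sub-image to be equal to that of the first sub-image; at least one convolution layer that performs convolution on the first sub-image, the scaled second sub-image, and the scaled third sub-image; and a depth-to-space layer that generates a feature map of the first color component, a feature map of the second color component, and a feature map of the third color component by combining some of the feature maps output from the at least one convolution layer; and the processor obtains a second image including a fourth sub-image of the first color component corresponding to the feature map of the first color component, a fifth sub-image of the second color component corresponding to the feature map of the second color component, and a sixth sub-image of the third color component corresponding to the feature map of the third color component.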
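- The electronic device of claim 7, wherein the size of the first sub-image is n times (n being a natural number) larger than the size of the second sub-image and the size of the third sub-image, and the number of feature maps used to generate the feature map of the first color component is n times greater than the number of feature maps used to generate the feature map of the second color component and the feature map of the third color component.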
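- The electronic device of claim 7, wherein the upscaling neural network further includes a second scaling layer that scales the first sub-image, the second sub-image, and the third sub-image based on a scaling factor of the upscaling neural network, and the processor obtains the fourth sub-image by adding the scaled first sub-image to the feature map of the first color component, obtains the fifth sub-image by adding the scaled second sub-image to the feature map of the second color component, and obtains the sixth sub-image by adding the scaled third sub-image to the feature map of the third color component.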
- The electronic device of claim 7, wherein sample values of the fourth sub-image, the fifth sub-image, and the sixth sub-image are clipped to values within a predetermined range.
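- The electronic device of claim 1, wherein the first image includes a first sub-image of a first color component, a second sub-image of a second color component, and a third sub-image of a third color component; the upscaling neural network includes: at least one convolution layer that performs convolution on the first sub-image; and a depth-to-space layer that generates a feature map of the first color component by combining the feature maps output from the at least one convolution layer; and the processor obtains a second image including a fourth sub-image of the first color component corresponding to the feature map of the first color component, a fifth sub-image of the second color component scaled from the second sub-image by the scaling factor of the upscaling neural network, and a sixth sub-image of the third color component scaled from the third sub-image by the scaling factor of the upscaling neural network.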
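- A method of reproducing an image, comprising: obtaining a first image; identifying an application used to obtain the first image; selecting, from among a plurality of pieces of neural network setting information, neural network setting information corresponding to the identified application; obtaining a second image by AI (Artificial Intelligence) upscaling the first image using an upscaling neural network to which the selected neural network setting information is applied; and providing the obtained second image to a display.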
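- The method of claim 12, wherein selecting the neural network setting information includes selecting the neural network setting information based on a comparison between information about images provided through a plurality of applications and information related to the first image.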
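- The method of claim 12, wherein selecting the neural network setting information includes: identifying an address accessed by a web browser to obtain the first image; and selecting the neural network setting information by comparing the identified address with addresses corresponding to a plurality of applications.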
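- A computer-readable recording medium having recorded thereon a program for an image reproducing method executable by a computer, wherein the image reproducing method comprises: obtaining a first image; identifying an application used to obtain the first image; selecting, from among a plurality of pieces of neural network setting information, neural network setting information corresponding to the identified application; obtaining a second image by AI (Artificial Intelligence) upscaling the first image using an upscaling neural network to which the selected neural network setting information is applied; and providing the obtained second image to a display.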
Priority Applications (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202280040979.8A CN117480507A (zh) | 2021-08-10 | 2022-08-09 | 根据应用基于ai来回放图像的电子装置以及用于通过该电子装置来回放图像的方法 |
| EP22856176.7A EP4322057A4 (en) | 2021-08-10 | 2022-08-09 | ELECTRONIC DEVICE FOR READING AN IMAGE BASED ON AI ACCORDING TO AN APPLICATION, AND METHOD FOR READING AN IMAGE USING SAME |
| US17/893,754 US12223617B2 (en) | 2021-08-10 | 2022-08-23 | Electronic apparatus and method for reproducing image based on artificial intelligence according to application |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| KR10-2021-0105605 | 2021-08-10 | ||
| KR1020210105605A KR20230023460A (ko) | 2021-08-10 | 2021-08-10 | 어플리케이션에 따라 ai 기반으로 영상을 재생하는 전자 장치 및 이에 의한 영상 재생 방법 |
Related Child Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US17/893,754 Continuation US12223617B2 (en) | 2021-08-10 | 2022-08-23 | Electronic apparatus and method for reproducing image based on artificial intelligence according to application |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2023018179A1 true WO2023018179A1 (ko) | 2023-02-16 |
Family
ID=85200110
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/KR2022/011855 Ceased WO2023018179A1 (ko) | 2021-08-10 | 2022-08-09 | 어플리케이션에 따라 ai 기반으로 영상을 재생하는 전자 장치 및 이에 의한 영상 재생 방법 |
Country Status (5)
| Country | Link |
|---|---|
| US (1) | US12223617B2 (ko) |
| EP (1) | EP4322057A4 (ko) |
| KR (1) | KR20230023460A (ko) |
| CN (1) | CN117480507A (ko) |
| WO (1) | WO2023018179A1 (ko) |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2023274404A1 (en) * | 2021-07-01 | 2023-01-05 | Beijing Bytedance Network Technology Co., Ltd. | Application of super resolution |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| KR20150088121A (ko) * | 2014-01-23 | 2015-07-31 | 세종대학교산학협력단 | Mpeg-7 서술자 처리과정에서의 이미지 필터링 방법 및 장치 |
| KR101993001B1 (ko) * | 2019-01-16 | 2019-06-25 | 영남대학교 산학협력단 | 영상 하이라이트 제작 장치 및 방법 |
| KR20190103047A (ko) * | 2018-02-27 | 2019-09-04 | 엘지전자 주식회사 | 신호 처리 장치 및 이를 구비하는 영상표시장치 |
| KR102190483B1 (ko) * | 2018-04-24 | 2020-12-11 | 주식회사 지디에프랩 | Ai 기반의 영상 압축 및 복원 시스템 |
| KR20210067783A (ko) * | 2019-11-29 | 2021-06-08 | 삼성전자주식회사 | 전자 장치, 그 제어 방법 및 시스템 |
Family Cites Families (14)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2002341849A (ja) | 2001-05-15 | 2002-11-29 | Funai Electric Co Ltd | 画質調整装置 |
| US7801382B2 (en) * | 2005-09-22 | 2010-09-21 | Compressus, Inc. | Method and apparatus for adjustable image compression |
| US8350867B2 (en) * | 2009-12-22 | 2013-01-08 | Ati Technologies Ulc | Image quality configuration apparatus, system and method |
| WO2012048158A1 (en) * | 2010-10-06 | 2012-04-12 | Planet Data Solutions | System and method for indexing electronic discovery data |
| US20140072242A1 (en) * | 2012-09-10 | 2014-03-13 | Hao Wei | Method for increasing image resolution |
| KR102033078B1 (ko) | 2017-10-30 | 2019-10-16 | 에스케이텔레콤 주식회사 | 화질 측정 기반 영상 처리방법 및 그 장치 |
| KR102525576B1 (ko) | 2018-10-19 | 2023-04-26 | 삼성전자주식회사 | 영상의 ai 부호화 및 ai 복호화 방법, 및 장치 |
| WO2020080765A1 (en) | 2018-10-19 | 2020-04-23 | Samsung Electronics Co., Ltd. | Apparatuses and methods for performing artificial intelligence encoding and artificial intelligence decoding on image |
| WO2020231016A1 (en) | 2019-05-16 | 2020-11-19 | Samsung Electronics Co., Ltd. | Image optimization method, apparatus, device and storage medium |
| CN111951172A (zh) | 2019-05-16 | 2020-11-17 | 北京三星通信技术研究有限公司 | 一种图像优化方法、装置、设备和存储介质 |
| KR20190119550A (ko) | 2019-10-02 | 2019-10-22 | 엘지전자 주식회사 | 영상의 해상도를 향상시키기 위한 방법 및 장치 |
| KR102863767B1 (ko) | 2019-11-21 | 2025-09-24 | 삼성전자주식회사 | 전자 장치 및 그 제어 방법 |
| US12307627B2 (en) | 2019-11-21 | 2025-05-20 | Samsung Electronics Co., Ltd. | Electronic apparatus and controlling method thereof |
| KR20210067699A (ko) | 2019-11-29 | 2021-06-08 | 삼성전자주식회사 | 전자 장치 및 그 제어 방법 |
2021
- 2021-08-10 KR KR1020210105605A patent/KR20230023460A/ko active Pending
2022
- 2022-08-09 CN CN202280040979.8A patent/CN117480507A/zh active Pending
- 2022-08-09 WO PCT/KR2022/011855 patent/WO2023018179A1/ko not_active Ceased
- 2022-08-09 EP EP22856176.7A patent/EP4322057A4/en active Pending
- 2022-08-23 US US17/893,754 patent/US12223617B2/en active Active
Non-Patent Citations (1)
| Title |
|---|
| See also references of EP4322057A4 * |
Also Published As
| Publication number | Publication date |
|---|---|
| EP4322057A1 (en) | 2024-02-14 |
| KR20230023460A (ko) | 2023-02-17 |
| CN117480507A (zh) | 2024-01-30 |
| EP4322057A4 (en) | 2024-10-09 |
| US12223617B2 (en) | 2025-02-11 |
| US20240054602A1 (en) | 2024-02-15 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| WO2020080665A1 (en) | Methods and apparatuses for performing artificial intelligence encoding and artificial intelligence decoding on image | |
| WO2022071647A1 (en) | Video quality assessment method and apparatus | |
| WO2021101243A1 (en) | Apparatus and method for using ai metadata related to image quality | |
| WO2020071830A1 (ko) | 히스토리 기반 움직임 정보를 이용한 영상 코딩 방법 및 그 장치 | |
| WO2015009107A1 (en) | Method and apparatus for generating 3k-resolution display image for mobile terminal screen | |
| WO2017065525A2 (ko) | 영상을 부호화 또는 복호화하는 방법 및 장치 | |
| WO2019009489A1 (ko) | 영상을 부호화/복호화 하는 방법 및 그 장치 | |
| WO2016064185A1 (ko) | 최적화 함수를 이용하여 그래프 기반 예측을 수행하는 방법 및 장치 | |
| WO2020141879A1 (ko) | 영상 코딩 시스템에서 서브 블록 기반 시간적 머지 후보를 사용하는 어파인 움직임 예측에 기반한 영상 디코딩 방법 및 장치 | |
| WO2017014585A1 (ko) | 그래프 기반 변환을 이용하여 비디오 신호를 처리하는 방법 및 장치 | |
| WO2021172834A1 (en) | Apparatus and method for performing artificial intelligence encoding and artificial intelligence decoding on image by using pre-processing | |
| WO2020076069A1 (ko) | Atmvp 후보를 기반으로 영상 코딩을 수행하는 장치 | |
| WO2020180119A1 (ko) | Cclm 예측에 기반한 영상 디코딩 방법 및 그 장치 | |
| WO2021096057A1 (ko) | 비디오 또는 영상 코딩 시스템에서의 엔트리 포인트 관련 정보에 기반한 영상 코딩 방법 | |
| WO2021034161A1 (ko) | 인트라 예측 장치 및 방법 | |
| WO2021040398A1 (ko) | 팔레트 이스케이프 코딩 기반 영상 또는 비디오 코딩 | |
| WO2022050612A1 (ko) | 포인트 클라우드 데이터 전송장치, 포인트 클라우드 데이터 전송방법, 포인트 클라우드 데이터 수신장치 및 포인트 클라우드 데이터 수신방법 | |
| WO2019009620A1 (ko) | 인트라 예측 모드 기반 영상 처리 방법 및 이를 위한 장치 | |
| WO2023018179A1 (ko) | 어플리케이션에 따라 ai 기반으로 영상을 재생하는 전자 장치 및 이에 의한 영상 재생 방법 | |
| WO2021096252A1 (en) | Image providing apparatus and image providing method thereof, and display apparatus and display method thereof | |
| WO2021091214A1 (ko) | 크로마 양자화 파라미터 오프셋 관련 정보를 코딩하는 영상 디코딩 방법 및 그 장치 | |
| WO2021101066A1 (ko) | 비디오 또는 영상 코딩 시스템에서의 엔트리 포인트 관련 정보에 기반한 영상 코딩 방법 | |
| WO2021034160A1 (ko) | 매트릭스 인트라 예측 기반 영상 코딩 장치 및 방법 | |
| WO2024063532A1 (ko) | Mrl(multi reference line)을 이용한 인트라 예측 모드에 기반한 영상 부호화/복호화 방법, 장치 및 비트스트림을 저장하는 기록 매체 | |
| WO2023075568A1 (ko) | 부호화 구조의 채널간 참조에 기반한 피쳐 부호화/복호화 방법, 장치, 비트스트림을 저장한 기록 매체 및 비트스트림 전송 방법 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22856176; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 2022856176; Country of ref document: EP |
| | ENP | Entry into the national phase | Ref document number: 2022856176; Country of ref document: EP; Effective date: 20231108 |
| | WWE | Wipo information: entry into national phase | Ref document number: 202280040979.8; Country of ref document: CN |
| | NENP | Non-entry into the national phase | Ref country code: DE |