JP7045351B2

JP7045351B2 - Automatic image generation method and automatic image generation device using skim-pixel convolutional neural network

Info

Publication number: JP7045351B2
Application number: JP2019149658A
Authority: JP
Inventors: ユ，ヨンジュン; チョン，サンヒョク; ユ，ジェジュン; ユン，サンド; ハ，ジョンウ
Original assignee: Naver Corp
Current assignee: Naver Corp
Priority date: 2018-09-03
Filing date: 2019-08-19
Publication date: 2022-03-31
Anticipated expiration: 2039-08-19
Also published as: JP2020038639A; KR102103727B1; KR20200026435A

Description

The present application relates to an automatic image generation method and an automatic image generation device capable of automatically generating a new image from learned images, and in particular to an automatic image generation method and an automatic image generation device that use a pixel convolutional neural network (CNN) model.

Machine learning is a core technology used in almost every field, including Internet information retrieval, text mining, speech recognition, robotics, and the service industry. Recently, deep learning, a branch of machine learning, has been in the spotlight in various fields. In image-based object recognition in particular, machine learning techniques that use a convolutional neural network (CNN), a kind of deep learning technology, have attracted attention.

Convolutional neural network technology is a kind of artificial neural network technology designed by imitating the human nervous system. It is actively researched in various image processing and computer vision fields for tasks such as understanding an input image through computation, extracting features to obtain information, and generating new images.

The present application provides an automatic image generation method and an automatic image generation device using a skim-pixel CNN that can automatically generate a new image from learned images.

The present application provides an automatic image generation method and an automatic image generation device using a skim-pixel CNN that can reduce the amount of computation and the computation time required for image generation.

An automatic image generation method using a skim-pixel CNN according to an embodiment of the present invention includes: a prediction step in which a pixel prediction unit simultaneously generates pixel prediction values for a plurality of target pixels to be generated, using the pixel values of existing pixels already generated in the image; a confidence generation step in which a confidence estimation unit generates, for each target pixel, a confidence for the pixel prediction value; a skimming step in which, if the confidence of the target pixel is equal to or greater than a set value, a pixel generation unit sets the pixel prediction value as the pixel value of the target pixel; and a draw step in which, if the confidence of the target pixel is less than the set value, the pixel generation unit generates a pixel inference value for the target pixel using a pixel CNN model and sets the pixel inference value as the pixel value of the target pixel.

An automatic image generation device using a skim-pixel CNN according to an embodiment of the present invention includes: a pixel prediction unit that simultaneously generates pixel prediction values for a plurality of target pixels to be generated, using the pixel values of existing pixels already generated in the image; a confidence estimation unit that generates, for each target pixel, a confidence for the pixel prediction value; and a pixel generation unit that sets the pixel prediction value as the pixel value of the target pixel if the confidence of the target pixel is equal to or greater than a set value and, if the confidence of the target pixel is less than the set value, generates a pixel inference value for the target pixel using a pixel CNN model and sets the pixel inference value as the pixel value of the target pixel.

An automatic image generation device using a skim-pixel CNN according to another embodiment of the present invention includes a processor and a memory connected to the processor. The memory includes one or more modules configured to be executed by the processor, and the one or more modules include instructions to: simultaneously generate pixel prediction values for a plurality of target pixels to be generated, using the pixel values of existing pixels already generated in the image; generate, for each target pixel, a confidence for the pixel prediction value; set the pixel prediction value as the pixel value of the target pixel if the confidence of the target pixel is equal to or greater than a set value; and, if the confidence of the target pixel is less than the set value, generate a pixel inference value for the target pixel using a pixel CNN model and set the pixel inference value as the pixel value of the target pixel.

The means for solving the problems described above do not enumerate all features of the present invention. The various features of the present invention, and the advantages and effects that follow from them, can be understood in more detail with reference to the specific embodiments below.

According to embodiments of the present invention, a new image can be generated automatically from learned images, and the amount of computation and the computation time required for image generation can be reduced.

FIG. 1 is a schematic diagram showing an image generation device according to an embodiment of the present invention.
FIG. 2 is a block diagram showing an image generation device according to an embodiment of the present invention.
FIG. 3 is a block diagram showing an image generation device according to an embodiment of the present invention.
FIG. 4 is a schematic diagram showing image generation by an image generation device according to an embodiment of the present invention.
FIG. 5 is a schematic diagram showing image generation by an image generation device according to an embodiment of the present invention.
FIG. 6 is a schematic diagram showing image generation by an image generation device according to an embodiment of the present invention.
FIG. 7 is a diagram showing images generated using an image generation device according to an embodiment of the present invention.
FIG. 8 is a graph showing the image generation speed of an image generation device according to an embodiment of the present invention.
FIG. 9 is a flowchart showing an image generation method according to an embodiment of the present invention.

Hereinafter, the embodiments disclosed in this specification are described in detail with reference to the accompanying drawings. Identical or similar components are given the same reference numbers regardless of the drawing number, and redundant descriptions of them are omitted. The suffixes "module" and "unit" used for components in the following description are assigned or used interchangeably only for ease of drafting the specification and do not in themselves have distinct meanings or roles. That is, the terms "module" and "unit" as used in the present invention refer to software components or to hardware components such as an FPGA or an ASIC, and a "module" or "unit" performs certain roles. However, "module" and "unit" are not limited to software or hardware. A "module" or "unit" may be configured to reside in an addressable storage medium, or may be configured to run on one or more processors. Thus, as an example, "modules" and "units" include components such as software components, object-oriented software components, class components, and task components, as well as processes, functions, attributes, procedures, subroutines, segments of program code, drivers, firmware, microcode, circuits, data, databases, data structures, tables, arrays, and variables. The functions provided by the components and by the "modules" and "units" may be combined into a smaller number of components, "modules", and "units", or may be further separated into additional components, "modules", and "units".

In describing the embodiments disclosed in this specification, a detailed description of related known technology is omitted where it is judged that such a description could obscure the gist of the embodiments. The accompanying drawings are provided only to make the embodiments disclosed in this specification easier to understand. The technical idea disclosed in this specification is not limited by the accompanying drawings and should be understood to include all modifications, equivalents, and alternatives that fall within the idea and technical scope of the present invention.

FIG. 1 is a schematic diagram showing an automatic image generation device according to an embodiment of the present invention.

The automatic image generation device 100 can learn training images (t_i) using a machine learning technique such as deep learning, and can newly generate an arbitrary image (g_i) based on the learned result. For example, if the automatic image generation device 100 is trained on portrait photographs and then instructed to generate a new portrait photograph, it can generate a new portrait photograph that differs from the images it has already learned. Here, the pixels of images of the same type can have similar probability distributions, and the automatic image generation device 100 can obtain the probability distribution of the pixels contained in images of the same type by learning a plurality of training images (t_i).

As shown in FIG. 1, the automatic image generation device 100 can generate an image (g_i) containing a plurality of pixels arranged in n rows and m columns. The rows can be generated sequentially from top to bottom, and within each row the pixel values can be set while proceeding from left to right. The image (g_i) generated by the automatic image generation device 100 can therefore be represented as a sequence of the pixel values corresponding to the individual pixels. For example, an image X can be represented as $X = \{x_i \mid i = 1, \dots, n \times m\}$.
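As an illustration of this raster ordering, the following minimal Python sketch maps between a pixel position and its 1-based sequence index i. The function names and the use of NumPy are assumptions made for this example and do not appear in the specification.

```python
from typing import Tuple

import numpy as np


def to_sequence(image: np.ndarray) -> np.ndarray:
    """Flatten an (n, m) image into the raster-order sequence x_1 .. x_{n*m}."""
    return image.reshape(-1)  # row by row, left to right within each row


def index_of(row: int, col: int, m: int) -> int:
    """1-based sequence index i of the pixel at 0-based (row, col), for m columns."""
    return row * m + col + 1


def position_of(i: int, m: int) -> Tuple[int, int]:
    """Inverse of index_of: 0-based (row, col) of the i-th pixel (1-based)."""
    return divmod(i - 1, m)


image = np.arange(12).reshape(3, 4)  # toy image with n = 3 rows, m = 4 columns
sequence = to_sequence(image)
i = index_of(1, 2, m=4)
```

Under these assumptions, `sequence[i - 1]` and `image[1, 2]` refer to the same pixel.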

Here, the automatic image generation device 100 can generate an image using a pixel CNN model. The pixel CNN model can be expressed by the following equation (1).
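Equation (1) is not reproduced in this text. A form consistent with the symbols defined in this description and with the usual autoregressive factorization of a pixel CNN would be the following. This is a reconstruction offered as an assumption, not the verbatim equation of the specification.

```latex
p(X) = \prod_{l=1}^{n \times m} p_\theta\left(x_l \mid x_1, \dots, x_{l-1}\right),
\qquad
p\left(X_{j:i} \mid X_{\le i}\right) = \prod_{l=i+1}^{j} p_\theta\left(x_l \mid x_1, \dots, x_{l-1}\right)
```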

Here, $X_{\le i} = \{x_1, \dots, x_i\}$ denotes the pixel values of the existing pixels already generated in the image being generated, $X_{j:i} = \{x_{i+1}, \dots, x_j\}$ denotes the pixel values of the target pixels to be generated, and $p(X)$ is the probability function for the pixel values of an image X containing $n \times m$ pixels. Further, $n \times m$ is the total number of pixels in the image, and $j > i$ holds for all $j, i \in [1, n \times m]$.

In the pixel CNN model, $p_\theta(x_l \mid x_1, \dots, x_{l-1})$ can be approximated by filtering with masked convolutions. The pixel CNN model can thereby infer the pixel value of the next pixel ($x_{i+1}$) from the pixel values of the existing pixels ($X_{\le i}$) in the image, and can generate an image using the inferred pixel inference values.
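The specification does not give the layout of the mask. As a minimal sketch, assuming the mask layout commonly used in pixel CNN implementations (all rows above the kernel center, plus the positions to the left of the center in the same row, are visible), a mask could be built as follows. The function name `causal_mask` and the `include_center` flag are hypothetical.

```python
import numpy as np


def causal_mask(kernel_size: int, include_center: bool) -> np.ndarray:
    """Return a (k, k) 0/1 mask that hides pixels not yet generated in raster order."""
    if kernel_size % 2 == 0:
        raise ValueError("kernel_size must be odd so that the kernel has a center")
    mask = np.zeros((kernel_size, kernel_size), dtype=np.float32)
    center = kernel_size // 2
    mask[:center, :] = 1.0       # every row above the center row
    mask[center, :center] = 1.0  # positions left of the center in the center row
    if include_center:
        mask[center, center] = 1.0  # later layers may see the current position
    return mask


first_layer_mask = causal_mask(3, include_center=False)
later_layer_mask = causal_mask(3, include_center=True)
```

Multiplying a convolution kernel element-wise by such a mask before applying it keeps each output position from depending on pixels that come later in the generation order.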

However, when the pixel CNN model is used, the pixel value of the next pixel cannot be inferred unless the pixel values up to the immediately preceding pixel are known. A pixel inference value must therefore be computed for every pixel in the image to be generated, and these values must be computed sequentially, one at a time. In other words, a new image can be generated with the pixel CNN model, but the amount of computation is large and image generation takes a relatively long time.
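The sequential dependency described above can be sketched as a simple loop. The toy model `CountingModel` below is a hypothetical stand-in for a pixel CNN, used only to count how often the model must be evaluated; it is not the model of the specification.

```python
from typing import Callable, List


def sequential_generate(n: int, m: int, infer: Callable[[List[int]], int]) -> List[int]:
    """Generate n*m pixels one at a time; each call sees all previously set pixels."""
    pixels: List[int] = []
    for _ in range(n * m):
        pixels.append(infer(pixels))  # cannot start before the previous pixel is set
    return pixels


class CountingModel:
    """Toy stand-in for a pixel CNN that records the number of evaluations."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, previous: List[int]) -> int:
        self.calls += 1
        return (sum(previous) + 1) % 256


model = CountingModel()
image = sequential_generate(2, 3, model)
```

For an image of n rows and m columns the model is evaluated n × m times, and no evaluation can overlap with another.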

To solve this problem, the automatic image generation device 100 according to an embodiment of the present invention uses a skim-pixel CNN so that, during image generation, pixel values in relatively unimportant pixel regions can be set with a simple prediction model. That is, the generation of pixel inference values with the pixel CNN model can be skipped (skimmed) in those regions, which reduces the amount of computation required. In relatively important regions, the pixel inference values are generated directly with the pixel CNN model. A high-quality image can therefore be generated while the amount of computation required for image generation is reduced and the computation speed is improved.

FIG. 2 is a block diagram showing the automatic image generation device 100 according to an embodiment of the present invention. Referring to FIG. 2, the automatic image generation device 100 according to an embodiment of the present invention can include a pixel prediction unit 110, a confidence estimation unit 120, and a pixel generation unit 130.

Hereinafter, the automatic image generation device according to an embodiment of the present invention is described with reference to FIG. 2.

The pixel prediction unit 110 can simultaneously generate pixel prediction values for a plurality of target pixels to be generated, using the pixel values of existing pixels already generated in the image. As shown in FIG. 4(a), the pixel prediction unit 110 can set a target region P1 in advance and can simultaneously generate the pixel prediction values of the target pixels contained in the target region P1. Thereafter, once the pixel values for one row have been set, the next preset number of rows (for example, two) is set as the target region, and a pixel prediction value can be generated for each target pixel contained in that target region.

Specifically, when generating the pixel prediction value for the j-th target pixel in a state where pixel values have been set up to the i-th pixel ($j > i$, for all $j, i \in [1, n \times m]$), the pixel prediction unit 110 can generate the pixel prediction value for the j-th target pixel by applying to the pixel CNN model the pixel values up to the i-th pixel together with the pre-predicted values for the (i+1)-th through (j-1)-th target pixels.

With the existing pixel CNN model, computing the pixel inference value for the j-th target pixel required all pixel inference values up to the (j-1)-th target pixel to be computed first. In contrast, the pixel prediction unit 110 applies pre-predicted values for the (i+1)-th through (j-1)-th pixels, so the pixel prediction value for the j-th target pixel can be generated in advance even when the pixel values up to the (j-1)-th pixel are not known. That is, because the pixel prediction unit 110 can apply the pixel values of the existing pixels and the pre-predicted values to the pixel CNN model in parallel, it can compute the pixel prediction values for a plurality of target pixels simultaneously.

Here, the pre-predicted values can be extracted using a U-net neural network. The U-net neural network has an autoencoder structure and can use the pixel values of the first through i-th pixels to provide approximations of the pixel values of the (i+1)-th through (j-1)-th target pixels. The pre-predicted values have the independent and identically distributed (IID) property, so the pixel prediction value for the j-th target pixel and the pixel prediction value for the (j+1)-th target pixel can be computed at the same time. That is, because the pre-predicted value for each target pixel can be generated in advance, the pixel prediction unit 110 can apply the pixel values of the existing pixels and the pre-predicted values to the pixel CNN model in parallel, and can thereby compute the pixel prediction values for a plurality of target pixels simultaneously.
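The following minimal sketch illustrates why the pre-predicted values remove the sequential dependency. The function `predict_block` and the toy model are hypothetical; a real implementation would evaluate all contexts as one batch through the pixel CNN rather than in a Python loop.

```python
from typing import Callable, List, Sequence


def predict_block(
    known: Sequence[int],
    approximations: Sequence[int],
    model: Callable[[List[int]], int],
) -> List[int]:
    """Predict x_{i+1} .. x_j from X_{<=i} (known) and z_{i+1} .. z_{j-1} (approximations).

    The context for each target pixel uses only `known` and `approximations`,
    never another prediction, so the iterations are independent of each other.
    """
    predictions = []
    for offset in range(len(approximations) + 1):
        context = list(known) + list(approximations[:offset])
        predictions.append(model(context))
    return predictions


def toy_model(context: List[int]) -> int:
    """Toy stand-in for one pixel CNN evaluation."""
    return sum(context) % 256


block = predict_block(known=[10, 20], approximations=[30, 40], model=toy_model)
```

Because no iteration reads the result of another, the loop body could be evaluated for all target pixels at once.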

Specifically, the operation of the pixel prediction unit 110 can be expressed by the following equation (2), which is obtained by applying the pre-predicted values to equation (1).
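Equation (2) is not reproduced in this text. Based on the conditional term $p_\theta(x_l \mid X_{\le i}, z_{i+1}, \dots, z_{l-1})$ and the definition $Z_{j-1:i} = f_w(X_{\le i})$ given in this description, it would plausibly take the following form. The left-hand notation is an assumption made for this reconstruction.

```latex
q\left(X_{j:i} \mid X_{\le i}\right) = \prod_{l=i+1}^{j} p_\theta\left(x_l \mid X_{\le i}, z_{i+1}, \dots, z_{l-1}\right),
\qquad
Z_{j-1:i} = \{z_{i+1}, \dots, z_{j-1}\} = f_w\left(X_{\le i}\right)
```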

Here, $q(x)$ is the approximated probability function of the image X to which the pre-predicted values are applied. The pre-predicted values for the target pixels, $Z_{j-1:i} = \{z_{i+1}, \dots, z_{j-1}\}$, can be defined as $Z_{j-1:i} = f_w(X_{\le i})$, and the pre-predicted values can have the IID property with respect to $X_{\le i}$. Here, $f_w(X_{\le i})$ can correspond to the U-net neural network, and $p_\theta(x_l \mid X_{\le i}, z_{i+1}, \dots, z_{l-1})$ can be computed, as in the pixel CNN model, by filtering with masked convolutions. Further, $q(x)$ can be generated by learning a plurality of training images together with the pixel CNN model.

The confidence estimation unit 120 can generate a confidence for each pixel prediction value generated by the pixel prediction unit 110. Because the pixel prediction values generated by the pixel prediction unit 110 are computed in an autoregressive (AR) manner from the pre-predicted values for the (i+1)-th through (j-1)-th target pixels, the error contained in the pre-predicted values grows progressively larger. That is, if an image is generated only from the values predicted by the pixel prediction unit 110, the error may produce an image that differs from the desired result. The confidence estimation unit 120 therefore computes a confidence indicating whether a pixel prediction value generated by the pixel prediction unit 110 can be used, and provides it as a quantitative value.

Specifically, the confidence can be calculated using the following equation (3).

Here, $f_k$ is the confidence for the pixel prediction value $x_k$ of the k-th target pixel, the pixel inference value [symbol missing in the source text] satisfies [expression missing in the source text], and $k = i+1, \dots, j$. That is, the confidence $f_k$ corresponds to the probability that the pixel prediction value $x_k$ is identical to the pixel inference value computed using the pixel CNN model. The higher the probability that the pixel prediction value is identical to the pixel inference value computed using the pixel CNN model, the higher the confidence for the pixel prediction value is set, and the lower this probability, the lower the confidence is set.
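Equation (3) is not reproduced in this text. A form consistent with the verbal description, in which the confidence is the probability that the prediction coincides with the value the pixel CNN model would infer, would be the following. The symbol $\hat{x}_k$ for the pixel inference value is a hypothetical name introduced only for this sketch.

```latex
f_k = \Pr\left[\, x_k = \hat{x}_k \,\right], \qquad k = i+1, \dots, j
```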

The confidence estimation unit 120 can learn the difference between the pixel inference values generated by the pixel CNN model and the pixel prediction values generated by the pixel prediction unit 110 using a machine learning technique such as deep learning, and can compute the confidence according to the learned model. In some embodiments, a sample image is first generated using the pixel CNN model, and the difference between the pixel inference value of each pixel in the generated sample image and the pixel prediction value generated by the pixel prediction unit 110 is then learned.

Further, in some embodiments, the confidence can be expressed as a binary classification. That is, if the confidence is equal to or greater than a set value (attention threshold), the pixel prediction value is judged to be reliable and the confidence can be reset to 1. If the confidence is less than the set value, the pixel prediction value is judged to be unreliable and the confidence can be reset to 0.
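This binarization can be sketched in a few lines. The function name `binarize_confidence`, the use of NumPy, and the example threshold of 0.5 are assumptions for illustration; the specification does not state a particular threshold value.

```python
import numpy as np


def binarize_confidence(confidence, threshold: float) -> np.ndarray:
    """Map each confidence to 1 (reliable) if it is >= threshold, otherwise to 0."""
    return (np.asarray(confidence, dtype=float) >= threshold).astype(np.uint8)


confidence_map = binarize_confidence([[0.2, 0.5], [0.9, 0.49]], threshold=0.5)
```

A confidence exactly equal to the threshold is treated as reliable, matching the "equal to or greater than" condition in the text.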

The pixel generation unit 130 can set the pixel value of a target pixel according to the confidence of the pixel prediction value for that target pixel. That is, if the confidence of the target pixel is equal to or greater than the set value, the pixel prediction value is set as the pixel value of the target pixel, and if the confidence of the target pixel is less than the set value, the pixel inference value generated by the pixel CNN model is set as the pixel value of the target pixel.

As shown in FIG. 4, once the target region P1 is set, the pixel prediction unit 110 and the confidence estimation unit 120 can generate the pixel prediction values and confidences corresponding to the target region P1. The confidences can be displayed as a binarized image based on the set value. That is, a confidence equal to or greater than the set value can be displayed in white (1), and a confidence less than the set value can be displayed in black (0).

Thereafter, as shown in FIG. 5(a), pixel values can be set sequentially, starting with the target pixels contained in region a1. In FIG. 5(a), the confidence corresponding to region a1 is displayed in white, which means that the pixel prediction values can be trusted. The pixel values for region a1 can therefore be set according to the pixel prediction values.

The pixel values of the target pixels contained in region a2, the region following region a1, can be set as shown in FIG. 5(b). The confidence corresponding to region a2 is displayed in black, which means that the pixel prediction values cannot be trusted. The pixel prediction values corresponding to region a2 are therefore not applied to region a2. Instead, pixel inference values are computed using the pixel CNN model, and the computed pixel inference values are set as the pixel values of region a2. In this case, the pixel values of region a2 can be set as shown in FIG. 6(a).

Meanwhile, as shown in FIG. 6(a), once the pixel value of a target pixel has been set to a pixel inference value, the pixel prediction values and confidences for the remaining target region P2 can be computed again and updated. That is, by updating the pixel prediction values and confidences to reflect the pixel value set from the pixel inference value, the error contained in the existing pixel prediction values is removed, and more accurate pixel prediction values and confidences can be generated.
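The skim, draw, and update steps described above can be combined into one loop. The sketch below is a simplified illustration under stated assumptions: the function `fill_region`, the callables `predict` and `draw`, and the toy models are hypothetical, and the sketch re-predicts the remaining region after every single drawn pixel, which is only one possible reading of the update step.

```python
from typing import Callable, List, Sequence, Tuple

Predictor = Callable[[Sequence[int], int], Tuple[List[int], List[float]]]
Drawer = Callable[[Sequence[int]], int]


def fill_region(
    known: Sequence[int],
    region_size: int,
    threshold: float,
    predict: Predictor,
    draw: Drawer,
) -> Tuple[List[int], int]:
    """Append region_size pixels to known; return (pixels, number of draw calls)."""
    pixels = list(known)
    target_end = len(pixels) + region_size
    draws = 0
    while len(pixels) < target_end:
        # Predict all remaining target pixels and their confidences at once.
        predictions, confidences = predict(pixels, target_end - len(pixels))
        for value, confidence in zip(predictions, confidences):
            if confidence >= threshold:
                pixels.append(value)  # skim: accept the cheap prediction
            else:
                pixels.append(draw(pixels))  # draw: fall back to the full model
                draws += 1
                break  # update: re-predict the rest from the corrected pixels
    return pixels, draws


def toy_predict(pixels: Sequence[int], count: int) -> Tuple[List[int], List[float]]:
    """Toy predictor whose confidence decays with distance from the known pixels."""
    values = [(pixels[-1] + 1 + t) % 256 for t in range(count)]
    confidences = [0.9 - 0.2 * t for t in range(count)]
    return values, confidences


def toy_draw(pixels: Sequence[int]) -> int:
    """Toy stand-in for one sequential pixel CNN inference."""
    return 128


result, draw_calls = fill_region([0], 5, threshold=0.6, predict=toy_predict, draw=toy_draw)
```

In this toy run five pixels are filled with a single call to the expensive `draw` function, which is the saving the skim step is meant to provide.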

Further, the pixel generation unit 130 can generate the first k pixels for image generation in advance. That is, for the first k pixels, no computation such as the pixel prediction value is performed. Instead, pixel inference values are computed using only the pixel CNN model and are used to generate the image. In some cases, random pixel values can be assigned to the first k pixels. For example, during image generation, the pixel values up to the first three columns can be set to pixel inference values obtained with the pixel CNN model or to random values.
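A minimal sketch of this initialization is shown below. The function `seed_pixels` is hypothetical, and the 8-bit value range (0 to 255) used for the random case is an assumption, since the specification does not state the pixel depth.

```python
import random
from typing import Callable, List, Optional, Sequence


def seed_pixels(
    k: int,
    draw: Optional[Callable[[Sequence[int]], int]] = None,
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Return the first k pixel values, from the full model if given, else at random."""
    if rng is None:
        rng = random.Random()
    pixels: List[int] = []
    for _ in range(k):
        if draw is not None:
            pixels.append(draw(pixels))  # sequential inference with the full model
        else:
            pixels.append(rng.randrange(256))  # assumed 8-bit pixel values
    return pixels


from_model = seed_pixels(3, draw=lambda previous: len(previous))
from_random = seed_pixels(4, rng=random.Random(0))
```

The seeded pixels then serve as the existing pixels from which the first set of pixel prediction values is generated.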

The pixel generation unit 130 can repeat the process described above until image generation is complete.

Meanwhile, as shown in FIG. 3, the automatic image generation device 100 according to an embodiment of the present invention can include physical components such as a processor 10 and a memory 40, and the memory 40 can contain one or more modules configured to be executed by the processor 10. Specifically, the one or more modules can include a pixel prediction module, a confidence estimation module, and a pixel generation module.

The processor 10 can execute various software programs and the instruction sets stored in the memory 40 to perform various functions and process data. The peripheral interface unit 30 can connect the input/output peripheral devices of the automatic image generation device 100 to the processor 10 and the memory 40, and the memory control unit 20 can control memory access when the processor 10 or another component of the automatic image generation device 100 accesses the memory 40. Depending on the embodiment, the processor 10, the memory control unit 20, and the peripheral interface unit 30 may be implemented on a single chip or on separate chips.

The memory 40 can include high-speed random access memory, one or more magnetic disk storage devices, non-volatile memory such as flash memory devices, and the like. The memory 40 can further include storage located remotely from the processor 10, network-attached storage accessed via a communication network such as the Internet, and the like.

The display unit 50 displays the generated image so that the user can check it visually. For example, the display unit 50 can be implemented using a liquid crystal display, a thin-film-transistor liquid crystal display, an organic light-emitting diode display, a flexible display, a three-dimensional (3D) display, an electrophoretic display, or the like. However, the present invention is not limited to these, and the display unit can be implemented in various other ways.

The input unit 60 receives input from the user; a keyboard, keypad, mouse, touch pen, touch pad, touch panel, jog wheel, jog switch, or the like can serve as the input unit 60.

On the other hand, as shown in FIG. 3, in the automatic image generation device 100 according to an embodiment of the present invention, the memory 40 can contain an operating system as well as the pixel prediction module, the reliability estimation module, the pixel generation module, and the like, which correspond to application programs. Here, each module is an instruction set for performing the functions described above and can be stored in the memory 40.

Therefore, in the automatic image generation device 100 according to an embodiment of the present invention, the processor 10 can access the memory 40 and execute the instructions corresponding to each module. The pixel prediction module, the reliability estimation module, and the pixel generation module correspond to the pixel prediction unit, the reliability estimation unit, and the pixel generation unit described above, respectively, so a detailed description is omitted here.

FIG. 7 shows example images generated using the skim-pixel CNN according to an embodiment of the present invention. In FIG. 7, the left column shows, in white, the regions where pixel prediction values were applied; the center column shows the reliability of the image, with higher reliability displayed in red and lower reliability in blue; and the right column shows the images actually generated. In FIG. 7(a), the entire image was generated using pixel prediction values; in FIG. 7(f), the entire image was generated from pixel inference values using the pixel CNN model; and from FIG. 7(a) to FIG. 7(f), the set value for applying the pixel CNN model is set progressively higher. Referring to FIG. 7, for images of people, the reliability of characteristic parts such as the ears, eyes, mouth, and nose is relatively low, and the image quality does not differ greatly from that obtained using only the pixel CNN model. Further, as shown in FIG. 8, the higher the proportion of pixels set from pixel prediction values, the faster the image generation.
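
As a rough illustration of how the set value trades quality for speed, the following Python sketch counts the fraction of pixels that would be set from pixel prediction values at several set values. The reliability numbers and the function name `skim_ratio` are hypothetical and are not data from the embodiment; the sketch also ignores the fact that reliabilities are re-estimated after each pixel CNN draw.

```python
def skim_ratio(reliabilities, set_value):
    """Fraction of pixels whose reliability is at or above the set value."""
    skimmed = sum(1 for r in reliabilities if r >= set_value)
    return skimmed / len(reliabilities)


# Hypothetical per-pixel reliabilities in [0, 1]; illustrative only.
reliabilities = [0.95, 0.90, 0.40, 0.85, 0.30, 0.99, 0.70, 0.60]

for set_value in (0.0, 0.5, 0.8, 1.1):
    # A higher set value sends more pixels to the slower pixel CNN path.
    print(set_value, skim_ratio(reliabilities, set_value))
```

Under these assumptions, a set value of 0 corresponds to the situation of FIG. 7(a), where every pixel takes its prediction value, and a set value above every reliability corresponds to FIG. 7(f), where every pixel is drawn with the pixel CNN model.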

FIG. 9 is a flowchart showing an automatic image generation method according to an embodiment of the present invention. Referring to FIG. 9, the automatic image generation method according to an embodiment of the present invention can include an initial generation step (S10), a prediction step (S20), a reliability generation step (S30), a skimming step (S40), a draw step (S50), and an update step (S60).

Hereinafter, the automatic image generation method according to an embodiment of the present invention will be described with reference to FIG. 9.

In the initial generation step (S10), the pixel generation unit can generate the first k pixels of the image to be generated, using pixel inference values extracted from the pixel CNN model. That is, for the first k pixels, no pixel prediction values or the like are computed; pixel inference values are computed using only the pixel CNN model and are used to generate the image. In some cases, random pixel values can be assigned to the first k pixels instead. For example, at the time of image generation, the pixel values up to the first three columns can be set using pixel inference values from the pixel CNN model or using random values.
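
The initial generation step can be sketched in Python as follows. This is a minimal sketch under stated assumptions: `initial_pixels` and the `infer_pixel` callback are hypothetical names, the callback merely stands in for the pixel CNN model, and pixel values are assumed to be integers in the range 0 to 255.

```python
import random


def initial_pixels(k, infer_pixel=None, seed=0):
    """Return the first k pixel values in raster order.

    infer_pixel(prefix) is a hypothetical stand-in for the pixel CNN model:
    it receives the pixels generated so far and returns the next value.
    If it is None, random values are assigned instead.
    """
    rng = random.Random(seed)
    pixels = []
    for _ in range(k):
        if infer_pixel is None:
            pixels.append(rng.randrange(256))   # random initial value
        else:
            pixels.append(infer_pixel(pixels))  # autoregressive inference
    return pixels


# Toy stand-in model: the next value is the number of pixels so far.
print(initial_pixels(3, infer_pixel=lambda prefix: len(prefix)))
print(initial_pixels(3))
```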

In the prediction step (S20), the pixel prediction unit can simultaneously generate pixel prediction values for a plurality of target pixels to be generated, using the pixel values of existing pixels already generated in the image. Here, the pixel prediction unit can set a target area in advance and simultaneously generate the pixel prediction values of the target pixels included in that target area. Thereafter, when the pixel values for one row have been set, the next preset number of rows (for example, two) are set as the target area, and a pixel prediction value is generated for each target pixel in that target area.
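
One way to picture the target area is as the raster-order indices of the next few rows. The sketch below assumes 0-based row and pixel indices and a row-major layout; the function name `target_area` and its parameters are hypothetical and are not taken from the embodiment.

```python
def target_area(completed_rows, rows_per_area, n_rows, m_cols):
    """Raster-order indices of the pixels in the next target area.

    completed_rows: number of rows whose pixel values are already set.
    rows_per_area:  preset number of rows per target area (for example, 2).
    """
    first_row = completed_rows
    last_row = min(completed_rows + rows_per_area, n_rows)  # clip at image end
    return [r * m_cols + c
            for r in range(first_row, last_row)
            for c in range(m_cols)]


# 4 x 3 image, one row finished, two rows per target area.
print(target_area(1, 2, 4, 3))
```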

Specifically, when generating the pixel prediction value for the j-th target pixel in a state where pixel values have been set up to the i-th pixel (j > i, ∀ j, i ∈ [1, n × m]), the pixel prediction unit can generate it by applying, to the pixel CNN model, the pixel values up to the i-th pixel and the pre-predicted values from the (i+1)-th target pixel to the (j-1)-th target pixel. Here, the pre-predicted values can be extracted using a U-net neural network and can have the independent and identically distributed (IID) property. That is, because the pre-predicted value for each target pixel can be generated in advance, the pixel prediction unit can apply the pixel values of the existing pixels and the pre-predicted values to the pixel CNN model in parallel, and can thereby compute the pixel prediction values for a plurality of target pixels simultaneously.
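
The indexing in this paragraph can be sketched as follows. This is only an illustration: `predict_targets` is a hypothetical name, the `model` callback is a toy stand-in for the pixel CNN model, and the pre-predicted values (which the embodiment obtains from a U-net) are simply passed in as a list. The point is that the context for each target pixel is built only from already-set pixels and pre-predicted values, so no prediction waits on another prediction.

```python
def predict_targets(known, priors, model):
    """Prediction values for the target pixels following the known pixels.

    known:  pixel values x_1 .. x_i that are already set.
    priors: pre-predicted values for x_(i+1) .. x_(i+t), one per target pixel.
    model:  hypothetical stand-in for the pixel CNN; maps a context to a value.

    For the target at offset o (pixel j = i + 1 + o), the context is the known
    pixels plus the pre-predicted values for pixels i+1 .. j-1.
    """
    # Each call is independent of the others, so the calls could run in parallel.
    return [model(known + priors[:offset]) for offset in range(len(priors))]


def toy_model(context):
    # Toy rule (an assumption): integer mean of the context.
    return sum(context) // len(context)


print(predict_targets([10, 20], [30, 40], toy_model))
```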

In the reliability generation step (S30), the reliability estimation unit can generate, for each target pixel, a reliability for the pixel prediction value. That is, the reliability estimation unit can calculate how far the pixel prediction value generated by the pixel prediction unit can be trusted and provide the result as a quantitative value. Here, the reliability is the probability that the pixel prediction value for a target pixel matches the pixel inference value for that target pixel, and the reliability estimation unit can compute the reliability by learning the difference between pixel inference values and pixel prediction values with a machine learning technique such as deep learning. Specifically, after sample images are generated using the pixel CNN model, learning can be performed by comparing the pixel inference value of each pixel in a generated sample image with the pixel prediction value generated by the pixel prediction unit.
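
The comparison used for learning can be pictured as building a per-pixel match label. The sketch below shows only that labeling step and an empirical match rate; it does not implement the learned reliability estimator itself. The names `match_labels` and `match_rate` and the sample values are hypothetical.

```python
def match_labels(inferred, predicted):
    """1 where the prediction value equals the pixel CNN inference value, else 0."""
    if len(inferred) != len(predicted):
        raise ValueError("sequences must have equal length")
    return [1 if a == b else 0 for a, b in zip(inferred, predicted)]


def match_rate(labels):
    """Empirical probability that a prediction matches the inference."""
    return sum(labels) / len(labels)


# Hypothetical pixel values from one sample image; illustrative only.
inferred = [12, 200, 37, 37, 90]    # values drawn with the pixel CNN model
predicted = [12, 198, 37, 40, 90]   # values from the pixel prediction unit

labels = match_labels(inferred, predicted)
print(labels, match_rate(labels))
```

Labels of this kind could serve as training targets for a model that outputs a match probability per pixel, which is one reading of the learning described above.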

In the skimming step (S40), the reliability of a target pixel can be compared with a set value, and if the reliability is equal to or higher than the set value, the pixel generation unit can set the pixel prediction value as the pixel value of the target pixel. That is, the reliability can be checked in order for each target pixel in the target area, and for a target pixel whose reliability is equal to or higher than the set value, the pixel value can be set from the pixel prediction value. In this case, no pixel inference value is computed with the pixel CNN model, so the pixel value can be set quickly. On the other hand, if the reliability of the target pixel is less than the set value, the process proceeds to the draw step (S50).

In the draw step (S50), if the reliability of the target pixel is less than the set value, the pixel generation unit can generate the pixel inference value of the target pixel using the pixel CNN model and set the pixel inference value as the pixel value of the target pixel. That is, because the reliability of the pixel prediction value of the target pixel is low, the pixel value is set from the pixel inference value obtained with the pixel CNN model instead of from the pixel prediction value. Computing the pixel inference value of a target pixel with the pixel CNN model takes relatively longer, but a more accurate image can be generated.
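
The choice between the skimming step and the draw step for a single target pixel can be sketched as below. The function name `set_pixel` and the `draw` callback, which stands in for a pixel CNN inference, are hypothetical; the callback is only invoked when the reliability falls below the set value, which is where the time saving comes from.

```python
def set_pixel(prediction, reliability, set_value, draw):
    """Return (pixel_value, used_cnn) for one target pixel.

    draw: zero-argument callable standing in for the costly pixel CNN inference.
    """
    if reliability >= set_value:
        return prediction, False   # skim (S40): keep the cheap prediction value
    return draw(), True            # draw (S50): fall back to the pixel CNN


calls = []


def toy_draw():
    calls.append(1)                # record that the costly path was taken
    return 128


print(set_pixel(120, 0.9, 0.8, toy_draw))   # skimmed
print(set_pixel(120, 0.5, 0.8, toy_draw))   # drawn
print(len(calls))                           # number of pixel CNN calls
```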

On the other hand, after the pixel value of a target pixel has been set from the pixel inference value, an update step (S60) can be executed in which the pixel prediction values and reliabilities for the remaining target area are recalculated and updated. That is, by updating the pixel prediction values and reliabilities to reflect the pixel value that was set from the pixel inference value, errors contained in the earlier pixel prediction values are removed, and more accurate pixel prediction values and reliabilities can be generated.

Thereafter, the skimming step (S40), the draw step (S50), and the update step (S60) can be repeated until image generation is complete, thereby generating the image.
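
Putting steps S10 to S60 together, the overall loop might look like the following Python sketch. It is a simplified illustration, not the implementation of the embodiment: the image is flattened to a raster-order list, the target area is a fixed number of pixels rather than whole rows, and `infer`, `predict`, and `confidence` are hypothetical callbacks standing in for the pixel CNN model, the pixel prediction unit, and the reliability estimation unit.

```python
def generate(n_pixels, k, set_value, infer, predict, confidence, area=4):
    """Generate n_pixels values; return (pixels, number of pixel CNN draws)."""
    pixels = []
    for _ in range(k):                                # S10: initial generation
        pixels.append(infer(pixels))
    draws = k
    while len(pixels) < n_pixels:
        remaining = min(area, n_pixels - len(pixels))  # next target area
        while remaining > 0:
            preds = predict(pixels, remaining)         # S20: prediction
            confs = confidence(pixels, preds)          # S30: reliability
            for pred, conf in zip(preds, confs):
                remaining -= 1
                if conf >= set_value:                  # S40: skim
                    pixels.append(pred)
                else:                                  # S50: draw
                    pixels.append(infer(pixels))
                    draws += 1
                    break                              # S60: recompute the rest
    return pixels, draws


# Toy stand-ins (assumptions): pixel at index p has the value (7 * p) % 256.
def toy_infer(px):
    return (7 * len(px)) % 256


def toy_predict(px, t):
    return [(7 * (len(px) + o)) % 256 for o in range(t)]


def toy_confidence(px, preds):
    return [1.0 if o % 2 == 0 else 0.0 for o in range(len(preds))]


image, draws = generate(10, 2, 0.5, toy_infer, toy_predict, toy_confidence)
print(image, draws)
```

In this toy setup every other target pixel falls below the set value, so the predictions and reliabilities for the rest of the target area are recomputed after each draw, mirroring the update step. Raising the set value above every reliability makes all pixels go through `infer`, while a set value of 0 skims everything after the first k pixels.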

According to the automatic image generation method and the automatic image generation device using the skim-pixel CNN according to embodiments of the present invention, pixel values in relatively unimportant pixel regions can be set with a simple prediction model during image generation. That is, the generation of pixel inference values with the pixel CNN model can be skipped for those regions, so the required amount of computation can be reduced. In relatively important regions, pixel inference values are generated directly with the pixel CNN model, so a high-quality image can be generated while the amount of computation needed for image generation is reduced and the computation speed is improved.

However, the effects achievable by the automatic image generation method and the automatic image generation device using the skim-pixel CNN according to embodiments of the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those skilled in the art from the above description.

The embodiments of the present invention described above can be implemented as a program in the form of computer-readable code, and can also be implemented as a computer-readable storage medium on which the program is stored. The computer-readable storage medium may store a computer-executable program persistently, or may store it temporarily for execution or download. The medium may be any of various recording or storage means in the form of a single piece of hardware or a combination of several pieces of hardware; it is not limited to a medium directly connected to a particular computer system and may be distributed over a network. Examples of the medium include magnetic media such as hard disks, floppy (registered trademark) disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and ROM, RAM, flash memory, and the like, configured to store program instructions. Other examples include recording or storage media managed by application stores that distribute applications, or by sites and servers that supply or distribute various other software. Therefore, the above detailed description should not be construed as limiting in any respect and should be regarded as illustrative. The scope of the present invention should be determined by a reasonable interpretation of the appended claims, and all modifications within the equivalent scope of the present invention are included in the scope of the present invention.

The present invention is not limited by the embodiments described above or by the accompanying drawings. It will be apparent to those skilled in the art that the components of the present invention can be replaced, modified, and changed without departing from the technical idea of the present invention.

100 ... Automatic image generation device
110 ... Pixel prediction unit
120 ... Reliability estimation unit
130 ... Pixel generation unit

Claims

An automatic image generation method comprising:
a prediction step in which a pixel prediction unit simultaneously generates pixel prediction values for a plurality of target pixels to be generated, using the pixel values of existing pixels already generated in an image;
a step in which a reliability estimation unit generates, for each target pixel, a reliability of the pixel prediction value with respect to a pixel inference value generated using a pixel CNN model; and
a step in which a pixel generation unit sets the pixel prediction value as the pixel value of the target pixel if the reliability of the target pixel is equal to or higher than a set value, and sets the pixel inference value as the pixel value of the target pixel if the reliability of the target pixel is less than the set value.

The automatic image generation method according to claim 1, wherein the image is generated to contain a plurality of pixels arranged in n rows and m columns, the rows are generated sequentially from top to bottom, and the pixel values of the pixels in each row are generated from left to right.

The automatic image generation method according to claim 1, wherein, in the prediction step, when the setting of pixel values for any one row is completed, the pixel prediction values for a target area including the next preset number of rows are generated simultaneously.

The automatic image generation method according to claim 3, further comprising a step of, if the pixel value of the target pixel is set to the pixel inference value, updating the pixel prediction values and reliabilities for the remaining target area after the target pixel to reflect the pixel value of the target pixel.

The automatic image generation method according to claim 2, wherein, in the prediction step, when the pixel prediction value for the j-th target pixel is generated in a state where pixel values have been set up to the i-th pixel (j > i, ∀ j, i ∈ [1, n × m]), the pixel values up to the i-th pixel and the pre-predicted values from the (i+1)-th target pixel to the (j-1)-th target pixel are applied to the pixel CNN model to generate the pixel prediction value for the j-th target pixel.

The automatic image generation method according to claim 5, wherein, in the prediction step, the pre-predicted values from the (i+1)-th target pixel to the (j-1)-th target pixel are extracted using a U-net neural network.

The automatic image generation method according to claim 6, wherein, in the prediction step, the pixel values of the existing pixels and the pre-predicted values are applied in parallel to the pixel CNN model to simultaneously calculate the pixel prediction values for a plurality of target pixels.

The automatic image generation method according to claim 1, wherein the reliability estimation unit generates the reliability by learning, using a sample image generated with the pixel CNN model, the difference between the pixel inference values of the pixels included in the sample image and the pixel prediction values generated by the pixel prediction unit.

The automatic image generation method according to claim 8, wherein the reliability estimation unit calculates the probability that the pixel prediction value for the target pixel matches the pixel inference value for the target pixel and provides it as the reliability for the target pixel.

The automatic image generation method according to claim 1, further comprising a step in which the pixel generation unit generates the first k pixels of the image using the pixel inference value extracted from the pixel CNN model.

A computer program that causes a computer to execute the automatic image generation method according to any one of claims 1 to 10.

An automatic image generation device comprising:
a pixel prediction unit that simultaneously generates pixel prediction values for a plurality of target pixels to be generated, using the pixel values of existing pixels already generated in an image;
a reliability estimation unit that generates, for each target pixel, a reliability of the pixel prediction value with respect to a pixel inference value generated using a pixel CNN model; and
a pixel generation unit that sets the pixel prediction value as the pixel value of the target pixel if the reliability of the target pixel is equal to or higher than a set value, and sets the pixel inference value as the pixel value of the target pixel if the reliability of the target pixel is less than the set value.

An automatic image generation device comprising a processor and a memory coupled to the processor,
wherein the memory contains one or more modules configured to be executed by the processor, and
the one or more modules include instructions for:
simultaneously generating pixel prediction values for a plurality of target pixels to be generated, using the pixel values of existing pixels already generated in an image;
generating, for each target pixel, a reliability of the pixel prediction value with respect to a pixel inference value generated using a pixel CNN model;
setting the pixel prediction value as the pixel value of the target pixel if the reliability of the target pixel is equal to or higher than a set value; and
setting the pixel inference value as the pixel value of the target pixel if the reliability of the target pixel is less than the set value.