
WO2022208843A1 - Training data processing device, training data processing method, and training data processing program - Google Patents


Info

Publication number
WO2022208843A1
WO2022208843A1 (PCT/JP2021/014159)
Authority
WO
WIPO (PCT)
Prior art keywords
image
data
learning
learning data
brightness
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2021/014159
Other languages
French (fr)
Japanese (ja)
Inventor
Shota Yamada (翔大 山田)
Hirokazu Kakinuma (弘員 柿沼)
Hidenobu Nagata (秀信 長田)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NTT Inc
Original Assignee
Nippon Telegraph and Telephone Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nippon Telegraph and Telephone Corp filed Critical Nippon Telegraph and Telephone Corp
Priority to JP2023510112A priority Critical patent/JPWO2022208843A1/ja
Priority to PCT/JP2021/014159 priority patent/WO2022208843A1/en
Publication of WO2022208843A1 publication Critical patent/WO2022208843A1/en
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current


Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis

Definitions

  • Embodiments of the present invention relate to a learning data processing device, a learning data processing method, and a learning data processing program.
  • Relighting is a technique for generating a relit image by changing the lighting environment of an input image to a desired one. Relighting techniques use deep learning to generate the desired relit image from the input image.
  • Non-Patent Document 1 trains a deep generative model that generates a relit image, using learning data in which an input image is paired with a teacher image obtained by changing only the lighting environment of the input image.
  • Non-Patent Document 2 proposes, for creating such learning data in a real environment, preparing a special facility surrounded by a large number of cameras and lights and shooting under a variety of shooting and lighting conditions.
  • In these methods, the input image used for learning is assumed to have no global shadows on the face region caused by the lighting being blocked by buildings, trees, or the like. For removing such shadows, a shadow removal method has been proposed (Non-Patent Document 3).
  • However, when a deep generative model that generates a relit image as in Non-Patent Document 1 is trained using the learning data created as in Non-Patent Document 2, the number of lighting environment patterns in the learning data is small. Therefore, after the deep generative model has been trained, when a relit image is generated from an input image containing shadows or highlights that are not included in the training data, those shadows or highlights remain in the generated relit image.
  • To remove the shadows from the relit image, additional shadow removal processing such as that of Non-Patent Document 3 is then required.
  • The present invention seeks to provide a technique that enables training of deep generative models that are robust to shadows or highlights.
  • A learning data processing device includes a data input unit, a data augmentation unit, and a data output unit.
  • The data input unit acquires learning data used for training the deep generative model, including an input image, the lighting environment of the input image, a teacher image that is an image obtained by changing only the lighting environment of the input image, the lighting environment of the teacher image, and a target region image indicating the brightness change target region in the input image.
  • The data augmentation unit creates a brightness-adjusted image by adjusting the brightness of the input image, creates a mask image indicating the brightness change region to which the brightness change is applied, and synthesizes the target region image, the brightness-adjusted image, and the mask image to create a data-augmented image.
  • The data output unit creates new learning data by replacing the input image in the learning data with the data-augmented image, and outputs the new learning data as learning data used for training the deep generative model.
  • FIG. 1 is a block diagram showing an example of the configuration of a deep generative model learning system comprising a learning data processing device according to the first embodiment of the present invention.
  • FIG. 2 is a diagram showing an example of a mask image data set possessed by the learning data processing device.
  • FIG. 3 is a diagram illustrating an example of a hardware configuration of a learning data processing device.
  • FIG. 4 is a flow chart showing an example of the processing operation of the learning data processing device.
  • FIG. 5 is a diagram showing an example of an input image that is one of learning data.
  • FIG. 6 is a diagram showing an example of a target region image, which is one of learning data.
  • FIG. 7 is a diagram showing an example of a shadow/highlight image created by the learning data processing device during processing.
  • FIG. 8 is a diagram showing another example of a shadow/highlight image.
  • FIG. 9 is a diagram showing an example of a mask image.
  • FIG. 10 is a diagram showing another example of the mask image.
  • FIG. 11 is a diagram showing an example of a reverse shadow/highlight imparting area image to be combined.
  • FIG. 12 is a diagram showing another example of a reverse shadow/highlight imparting area image to be combined.
  • FIG. 13 is a diagram illustrating an example of a data augmented image created by the learning data processing device.
  • FIG. 14 is a diagram showing another example of a data augmented image created by the learning data processing device.
  • FIG. 15 is a flow chart showing an example of the processing operation of the learning data processing device according to the second embodiment of the present invention.
  • FIG. 16 is a block diagram showing an example of the configuration of a deep generative model learning system including a learning data processing device according to the third embodiment of the present invention.
  • FIG. 17 is a flow chart showing an example of the processing operation of the learning data processing device according to the third embodiment.
  • FIG. 1 is a block diagram showing an example of the configuration of a deep generative model learning system including a learning data processing device 100 according to the first embodiment of the present invention.
  • The deep generative model learning system includes the learning data processing device 100, a learning device 200, and a learning data storage unit 300.
  • the deep generative model learning system may be configured such that each of these units is integrated as one device or housing, or may be configured from a plurality of devices. Also, multiple devices may be remotely located and connected via a network.
  • the learning data storage unit 300 stores learning data necessary for learning in the learning device 200.
  • The learning data includes the input image and its lighting environment, the teacher image (an image obtained by changing only the lighting environment of the input image) and its lighting environment, and a target area image showing the brightness change target area, that is, the area of the input image to which a shadow or highlight is to be applied (for example, the face region of a person).
  • The lighting environment is expressed, for example, as either vector data using spherical harmonics or an environment map image representing the surroundings reflected around the image.
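  • For context, one common concrete form of such spherical-harmonic vector data is a second-order (9-coefficient) real SH expansion per colour channel. The sketch below is only illustrative; the patent does not fix the encoding, and the `light` array is a hypothetical example.

```python
import numpy as np

def sh_basis(d):
    """Evaluate the 9 real spherical-harmonic basis functions (bands 0-2)
    for a unit direction vector d = (x, y, z)."""
    x, y, z = d / np.linalg.norm(d)
    return np.array([
        0.282095,                      # Y_0^0
        0.488603 * y,                  # Y_1^-1
        0.488603 * z,                  # Y_1^0
        0.488603 * x,                  # Y_1^1
        1.092548 * x * y,              # Y_2^-2
        1.092548 * y * z,              # Y_2^-1
        0.315392 * (3 * z * z - 1),    # Y_2^0
        1.092548 * x * z,              # Y_2^1
        0.546274 * (x * x - y * y),    # Y_2^2
    ])

# A lighting environment stored as a 9-coefficient vector per colour channel
# (hypothetical values); shading at a surface normal is the per-channel dot
# product with the basis evaluated at that normal.
light = np.random.default_rng(0).normal(size=(3, 9))
normal = np.array([0.0, 0.0, 1.0])
shading = light @ sh_basis(normal)
```
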
  • One epoch corresponds to transferring all of the prepared learning data from the learning data storage unit 300 to the learning data processing device 100 once.
  • The learning data processing device 100 performs data preprocessing, including data augmentation, on the learning data acquired from the learning data storage unit 300.
  • Data augmentation refers to a process of adding an image effect that simulates shadows or highlights to learning data.
  • The learning data processing device 100 passes the preprocessed learning data to the learning device 200.
  • the learning device 200 uses the learning data passed from the learning data processing device 100 to learn the deep generative model.
  • the learning device 200 uses the learned deep generation model to generate a re-illuminated image from an arbitrary input image.
  • the input image may be acquired via the learning data processing device 100, or may be acquired via an input device or a network (not shown).
  • the learning device 200 updates the parameters of the deep generative model and records the deep generative model by evaluating the generated re-illumination image and the learning data.
  • The learning data processing device 100 includes a data input unit 110, a data augmentation unit 120, and a data output unit 130.
  • The data input unit 110 acquires the learning data, that is, the input image and its lighting environment, the teacher image and its lighting environment, and the target area image, from the learning data storage unit 300.
  • Of the learning data, the data input unit 110 passes the input image and its lighting environment and the teacher image and its lighting environment to the data output unit 130.
  • The data input unit 110 uses random parameters to determine whether to perform data augmentation that increases the influence of illumination, and, when augmentation is to be performed, passes the input image and the target area image to the data augmentation unit 120.
  • The data augmentation unit 120 includes a brightness adjustment unit 121, a mask area creation unit 122, a mask image storage unit 123, and an image synthesis unit 124.
  • The brightness adjustment unit 121 passes the input image received from the data input unit 110 to the image synthesis unit 124. The brightness adjustment unit 121 also creates a shadow/highlight image, which is a brightness-adjusted image obtained by adjusting the brightness of the input image. The brightness adjustment unit 121 then passes the created shadow/highlight image and the target region image received from the data input unit 110 to the mask area creation unit 122.
  • the mask image storage unit 123 stores a mask image data set, which is a data set of irregular mask images.
  • FIG. 2 is a diagram showing an example of this mask image data set.
  • An existing data set may be used for the mask image, or an image created using Perlin noise may be used.
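  • As an illustration of how such an irregular mask could be produced, the following sketch thresholds bilinearly interpolated random value noise, a simple stand-in for Perlin noise (the actual mask data set and noise implementation are not specified in this document).

```python
import numpy as np

def noise_mask(h, w, scale=8, threshold=0.5, seed=None):
    """Create an irregular binary mask by thresholding smoothly
    interpolated random value noise (a cheap stand-in for Perlin noise)."""
    rng = np.random.default_rng(seed)
    coarse = rng.random((scale + 1, scale + 1))
    ys = np.linspace(0, scale, h)
    xs = np.linspace(0, scale, w)
    y0 = np.floor(ys).astype(int).clip(0, scale - 1)
    x0 = np.floor(xs).astype(int).clip(0, scale - 1)
    fy = (ys - y0)[:, None]
    fx = (xs - x0)[None, :]
    # Bilinearly interpolate the coarse noise grid up to (h, w)
    top = coarse[y0][:, x0] * (1 - fx) + coarse[y0][:, x0 + 1] * fx
    bot = coarse[y0 + 1][:, x0] * (1 - fx) + coarse[y0 + 1][:, x0 + 1] * fx
    smooth = top * (1 - fy) + bot * fy
    return (smooth > threshold).astype(np.float32)

mask = noise_mask(128, 128, seed=0)   # irregular mask with values in {0.0, 1.0}
```
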
  • The mask area creation unit 122 creates a shadow/highlight application area image, indicating the area to be shaded or highlighted, from the shadow/highlight image and target area image passed from the brightness adjustment unit 121 and from the mask image.
  • the mask image may be a mask image stored in advance in the mask image storage unit 123, or may be created by subjecting the shadow/highlight image to arbitrary binarization processing.
  • the mask area creation unit 122 can determine which of the pre-stored mask image and the created mask image is to be used using a random parameter.
  • The mask area creation unit 122 passes the shadow/highlight image and the created shadow/highlight application area image to the image synthesis unit 124.
  • The image synthesis unit 124 synthesizes the input image passed from the brightness adjustment unit 121 with the shadow/highlight image and the shadow/highlight application area image passed from the mask area creation unit 122 to create a data-augmented image.
  • The image synthesis unit 124 passes the created data-augmented image to the data output unit 130.
  • the data output unit 130 normalizes or standardizes each of the input image and teacher image passed from the data input unit 110 when data extension is not performed. Then, the data output unit 130 passes the normalized or standardized input image and the lighting environment of the input image, and the normalized or standardized teacher image and the lighting environment of the teacher image to the learning device 200 as learning data.
  • When data augmentation is performed, the data output unit 130 replaces the input image passed from the data input unit 110 with the data-augmented image passed from the image synthesis unit 124 of the data augmentation unit 120. That is, in this case, the data output unit 130 normalizes or standardizes the input image that has been replaced with the data-augmented image, together with the teacher image. The data output unit 130 then passes the normalized or standardized input image and its lighting environment and the normalized or standardized teacher image and its lighting environment to the learning device 200 as learning data. In other words, the data output unit 130 passes new learning data, different from the original learning data, to the learning device 200.
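  • Taken together, one preprocessing pass over a single learning-data sample could be sketched as follows. The dictionary keys, augmentation probability, and the simple whole-region darkening used here are illustrative assumptions, standing in for the full shadow/highlight synthesis described with the flowchart of FIG. 4.

```python
import numpy as np

def preprocess(sample, p_augment=0.5, rng=None):
    """One preprocessing pass (sketch; key names and parameters are
    illustrative, not the patent's). `sample` holds the input image,
    its lighting environment, the teacher image and its lighting
    environment, and the target-region mask."""
    rng = rng if rng is not None else np.random.default_rng()
    out = dict(sample)
    if rng.random() < p_augment:                     # random-parameter gate
        alpha = rng.uniform(0.3, 0.8)                # darken: simulated shadow
        i = sample['input'].astype(np.float32)
        j = alpha * i                                # brightness-adjusted image
        m = sample['target_mask'].astype(np.float32)[..., None]
        out['input'] = np.clip(m * j + (1 - m) * i, 0, 255).astype(np.uint8)
    out['input'] = out['input'].astype(np.float32) / 255.0    # normalise
    out['teacher'] = sample['teacher'].astype(np.float32) / 255.0
    return out
```
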
  • FIG. 3 is a diagram showing an example of the hardware configuration of the learning data processing device 100.
  • the learning data processing device 100 includes a processor 11, a program memory 12, a data memory 13, an input/output interface 14, and a communication interface 15, for example.
Program memory 12, data memory 13, input/output interface 14, and communication interface 15 are connected to processor 11 via a bus 16.
  • the learning data processing device 100 may be composed of, for example, a general-purpose computer such as a personal computer.
  • the processor 11 includes a multi-core/multi-threaded CPU (Central Processing Unit), and is capable of concurrently executing multiple pieces of information processing.
  • The program memory 12 includes, as storage media, a non-volatile memory that can be written and read at any time, such as an HDD (Hard Disk Drive) or an SSD (Solid State Drive), and a non-volatile memory such as a ROM (Read Only Memory), and stores the programs necessary for executing the various control processes according to the first embodiment of the present invention when executed by a processor 11 such as a CPU. That is, by reading and executing the programs stored in the program memory 12, the processor 11 can function as the data input unit 110, the data augmentation unit 120, and the data output unit 130 shown in FIG. 1. These processing function units may be realized by sequential processing on one CPU thread, or in a form allowing simultaneous parallel processing on separate CPU threads.
  • These processing function units may also be realized by separate CPUs; that is, the learning data processing device 100 may include multiple CPUs. In addition, at least some of these processing function units may be implemented in the form of integrated circuits such as ASICs (Application Specific Integrated Circuits), FPGAs (Field-Programmable Gate Arrays), GPUs (Graphics Processing Units), or various other hardware circuits. The programs stored in the program memory 12 can include a learning data processing program as shown in FIG. 4.
  • The data memory 13 uses, as storage media, a combination of a non-volatile memory that can be written and read at any time, such as an HDD or an SSD, and a volatile memory such as a RAM (Random Access Memory), and is used to store the various data needed for data preprocessing, including data augmentation.
  • In the data memory 13, a mask image data set storage area 13A for storing mask image data sets can be reserved. That is, the data memory 13 can function as the mask image storage unit 123.
  • In the data memory 13, a temporary storage area 13B can also be reserved for storing the various data obtained and created during data preprocessing, including data augmentation.
  • the input/output interface 14 is an interface with an input device such as a keyboard and mouse (not shown) and an output device such as a liquid crystal monitor.
  • The input/output interface 14 may also include an interface with a reader/writer for memory cards or disk media. If the mask image data set is provided recorded on a memory card or disk medium, the processor 11 can read it through the input/output interface 14 and store it in the mask image data set storage area 13A of the data memory 13.
  • the communication interface 15 includes, for example, one or more wired or wireless communication interface units, and enables transmission and reception of various information with devices on the network according to the communication protocol used on the network.
  • As a wired interface, for example, a wired LAN or a USB (Universal Serial Bus) interface is used. As a wireless interface, for example, an interface conforming to a low-power wireless data communication standard such as a wireless LAN is used.
  • The processor 11 can receive and acquire learning data from the learning data storage unit 300 via the communication interface 15.
  • processor 11 can obtain mask image data sets from devices on the network.
  • processor 11 can transmit learning data to learning device 200 via communication interface 15 .
  • FIG. 4 is a flowchart showing an example of the processing operation of the learning data processing device 100.
  • When the user instructs execution of the learning data processing program from an input device (not shown) through the input/output interface 14, the processor 11 starts the operation shown in this flowchart. Alternatively, the processor 11 may start the operation shown in this flowchart in response to an execution instruction received from the learning device 200 over the network via the communication interface 15.
  • the processor 11 operates as the data input unit 110 to acquire learning data from the learning data storage unit 300 (step S11).
  • The acquired learning data is stored in the temporary storage area 13B of the data memory 13.
  • the learning data includes the input image and the lighting environment of the input image, the teacher image and the lighting environment of the teacher image, and the target area image.
  • Next, the processor 11, operating as the data input unit 110, determines whether or not to perform data augmentation that increases the influence of illumination, using random parameters (step S12). If data augmentation is not to be performed (NO in step S12), the processor 11 proceeds to step S20, described later.
  • If data augmentation is to be performed (YES in step S12), the processor 11 performs the operation of the brightness adjustment unit 121 and first acquires the input image and the target area image (step S13). That is, the processor 11 reads the input image and the target area image from the temporary storage area 13B. Passing the input image and the target area image from the data input unit 110 to the brightness adjustment unit 121 in the configuration description corresponds to this saving to and reading from the temporary storage area 13B. The same applies to the following description.
  • FIG. 5 is a diagram showing an example of the input image I.
  • FIG. 6 is a diagram showing an example of the target area image Mf .
  • the processor 11 performs luminance adjustment on the entire input image I to create a shadow/highlight image (step S14).
  • This brightness adjustment includes a brightness decrease when adding effect A, which simulates a shadow, as data augmentation, and a brightness increase when adding effect B, which simulates a highlight, as data augmentation.
  • The brightness adjustment technique may be, for example, linear correction or gamma correction, and is preselected by the user.
  • For the brightness adjustment parameter α, the user sets upper and lower limits in advance under the condition that α < 1.0 for effect A and α > 1.0 for effect B, in both linear correction and gamma correction, and α is determined randomly within this range.
  • A shadow/highlight image J with adjusted brightness is then created as shown in Equation 1.
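  • Equation 1 itself is not reproduced in this text. The standard linear and gamma corrections consistent with the stated constraints on α would look like the following sketch (assuming 8-bit images; the exact formula in the patent may differ).

```python
import numpy as np

def adjust_brightness(img, alpha, mode="linear"):
    """Brightness adjustment for 8-bit images: alpha < 1.0 darkens
    (effect A, simulated shadow); alpha > 1.0 brightens (effect B,
    simulated highlight)."""
    x = img.astype(np.float32) / 255.0
    if mode == "linear":
        j = alpha * x                 # linear correction: J = alpha * I
    else:
        j = x ** (1.0 / alpha)        # gamma correction: J = I ** (1/alpha)
    return np.clip(np.rint(j * 255.0), 0, 255).astype(np.uint8)

# alpha is drawn at random within a user-set range, e.g. for effect A:
rng = np.random.default_rng()
alpha = rng.uniform(0.3, 0.8)
```
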
  • FIG. 7 is a diagram showing an example of a shadow/highlight image J that has undergone luminance adjustment in the case of adding an effect A imitating a shadow as data extension.
  • the shadow/highlight image J in this case is a shadow image.
  • FIG. 8 is a diagram showing an example of a shadow/highlight image J subjected to luminance adjustment in the case of adding an effect B simulating a highlight as data extension.
  • the shadow/highlight image J in this case is a highlight image.
  • the processor 11 stores the shadow/highlight image J thus created in the temporary storage area 13B.
  • Next, the processor 11 executes the operation of the mask area creation unit 122 and determines whether to use the mask image data set stored in advance in the mask image storage unit 123, that is, in the mask image data set storage area 13A (step S15). That is, the processor 11 determines, based on random parameters, whether to use a mask image prepared in advance or to create a mask image from the shadow/highlight image J.
  • If a pre-stored mask image is to be used (YES in step S15), the processor 11 acquires a mask image Md from the mask image data set stored in the mask image storage unit 123, using, for example, random parameters (step S16).
  • FIG. 9 is a diagram showing an example of the acquired mask image Md .
  • the processor 11 stores this acquired mask image Md in the temporary storage area 13B.
  • If a pre-stored mask image is not to be used (NO in step S15), the processor 11 reads from the temporary storage area 13B the highlight image, that is, the shadow/highlight image J whose brightness was adjusted for adding effect B, which imitates a highlight, as data augmentation.
  • the processor 11 creates a mask image Md by performing arbitrary binarization processing on the shadow/highlight image J (step S17).
  • FIG. 10 is a diagram showing an example of a mask image Md created from this highlight image. The processor 11 stores this created mask image Md in the temporary storage area 13B.
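  • As one concrete example of such binarization, Otsu's method could be applied to the grayscale shadow/highlight image; this is an assumption, since the binarization method is left arbitrary here.

```python
import numpy as np

def binarize(gray, threshold=None):
    """Create a binary mask M_d from an 8-bit grayscale shadow/highlight
    image. If no threshold is given, use Otsu's method (maximising the
    between-class variance over all candidate thresholds)."""
    if threshold is None:
        hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
        p = hist / hist.sum()
        omega = np.cumsum(p)                    # class-0 probability
        mu = np.cumsum(p * np.arange(256))      # class-0 mean * omega
        mu_t = mu[-1]                           # global mean
        with np.errstate(divide="ignore", invalid="ignore"):
            sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1 - omega))
            threshold = int(np.nanargmax(sigma_b))
    return (gray > threshold).astype(np.float32)
```
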
  • The processor 11 then, operating as the mask area creation unit 122, creates the shadow/highlight application area image M using the mask image Md and the target region image Mf (step S18).
  • Next, the processor 11 executes the operation of the image synthesis unit 124, reads the input image I, the shadow/highlight image J, and the shadow/highlight application area image M from the temporary storage area 13B, and synthesizes them according to Equation 3 to create the data-augmented image I′ (step S19).
  • The shadow/highlight image J read here is the shadow image when the mask image Md was acquired from the mask image data set, or the corresponding highlight image when the mask image Md was created from the shadow/highlight image J.
  • The processor 11 can determine which shadow/highlight image J is to be read according to what was stored in the temporary storage area 13B in step S16 or step S17. Alternatively, when the mask image Md is stored in the temporary storage area 13B in step S16 or S17, the shadow/highlight image J that is not used for image synthesis may be deleted from the temporary storage area 13B.
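  • Equation 3 is not reproduced in this text, but the reverse application area image 1 − M suggests alpha-blend compositing of J and I over the application area M. The following is a sketch under that assumption.

```python
import numpy as np

def synthesize(input_img, shadow_highlight_img, area_mask):
    """Assumed Equation-3-style compositing: I' = M * J + (1 - M) * I,
    where M is the shadow/highlight application area image and 1 - M
    the reverse application area image. The mask is broadcast over
    colour channels."""
    m = area_mask[..., None] if area_mask.ndim == 2 else area_mask
    i = input_img.astype(np.float32)
    j = shadow_highlight_img.astype(np.float32)
    return np.clip(np.rint(m * j + (1.0 - m) * i), 0, 255).astype(np.uint8)
```
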
  • FIGS. 11 and 12 are diagrams showing examples of the reverse shadow/highlight application area image to be combined, indicated by 1 − M in Equation 3.
  • FIG. 11 corresponds to the mask image Md of FIG. 9, and FIG. 12 corresponds to the mask image Md of FIG. 10.
  • FIG. 13 shows a data-augmented image I′ created from the input image I, a shadow/highlight image J that is a shadow image, and the reverse shadow/highlight application area image 1 − M of FIG. 11. FIG. 14 shows a data-augmented image I′ created from the input image I, a shadow/highlight image J that is a highlight image, and the reverse shadow/highlight application area image 1 − M of FIG. 12.
  • the processor 11 then operates as the data output unit 130 and transmits learning data (step S20).
  • If data augmentation was not performed (NO in step S12), the processor 11 reads the input image and teacher image stored in the temporary storage area 13B, normalizes or standardizes them, and stores them again in the temporary storage area 13B. The processor 11 then reads the input image and its lighting environment and the teacher image and its lighting environment from the temporary storage area 13B and transmits them to the learning device 200 via the communication interface 15.
  • If data augmentation was performed (YES in step S12), the processor 11 reads the data-augmented image I′ from the temporary storage area 13B. The processor 11 then normalizes or standardizes the data-augmented image I′ and saves the result as the input image I, overwriting the input image I already saved in the temporary storage area 13B. That is, the processor 11 rewrites the input image I stored in the temporary storage area 13B to the normalized or standardized data-augmented image I′. The processor 11 also reads the teacher image stored in the temporary storage area 13B, normalizes or standardizes it, and stores it again in the temporary storage area 13B. The processor 11 then reads the input image and its lighting environment and the teacher image and its lighting environment from the temporary storage area 13B and transmits them to the learning device 200 via the communication interface 15.
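  • Normalization and standardization here could be, for example, the following typical operations (the patent does not fix the formulas).

```python
import numpy as np

def normalize(img):
    """Scale 8-bit pixel values to the range [0, 1]."""
    return img.astype(np.float32) / 255.0

def standardize(img):
    """Shift and scale an image to zero mean and unit variance
    (epsilon guards against constant images)."""
    x = img.astype(np.float32)
    return (x - x.mean()) / (x.std() + 1e-8)
```
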
  • As described above, in the first embodiment, the data input unit 110 acquires from the learning data storage unit 300 learning data used for training the deep generative model, including the input image I, the lighting environment of the input image I, a teacher image that is an image obtained by changing only the lighting environment of the input image I, the lighting environment of the teacher image, and the target region image Mf indicating the brightness change target region in the input image I.
  • The data augmentation unit 120 creates the shadow/highlight image J, which is a brightness-adjusted image obtained by adjusting the brightness of the input image I, and acquires or creates the mask image Md indicating the brightness change region to which the brightness change is applied.
  • The learning data processing device 100 then, using the data output unit 130, creates new learning data by replacing the input image I in the learning data with the data-augmented image I′, and outputs the new learning data to the learning device 200 as learning data used for training the deep generative model.
  • In this way, the learning data processing apparatus 100 creates the data-augmented image I′ based on the learning data and creates new learning data including the data-augmented image I′, so that the number of pieces of learning data used for training the deep generative model can be increased. Therefore, the learning device 200 can train the deep generative model using learning data in which the influence of irregular lighting environments is increased, making it possible to realize training of a deep generative model that is robust to shadows or highlights.
  • The data augmentation unit 120 includes the brightness adjustment unit 121, which creates the shadow/highlight image J, a brightness-adjusted image, by lowering the brightness of the entire input image I, and the image synthesis unit 124, which synthesizes the target region image Mf, the shadow/highlight image J, and the mask image Md to create a data-augmented image I′ in which the portion corresponding to the brightness change region, within the region corresponding to the brightness change target region in the input image I, is darkened.
  • With this configuration, the learning data processing apparatus 100 adds an image effect simulating a shadow to the learning data, thereby artificially increasing the variety of lighting environment patterns in the learning data, which makes it possible to realize training of a deep generative model in the learning device 200 that is robust to shadows.
  • The data augmentation unit 120 likewise includes the brightness adjustment unit 121, which creates the shadow/highlight image J, a brightness-adjusted image, by raising the brightness of the entire input image I, and the image synthesis unit 124, which synthesizes the target region image Mf, the shadow/highlight image J, and the mask image Md to create a data-augmented image I′ in which the portion corresponding to the brightness change region, within the region corresponding to the brightness change target region in the input image I, is brightened.
  • With this configuration, the learning data processing apparatus 100 adds an image effect simulating a highlight to the learning data, thereby artificially increasing the variety of lighting environment patterns in the learning data, which makes it possible to realize training of a deep generative model in the learning device 200 that is robust to highlights.
  • The data augmentation unit 120 also includes the mask area creation unit 122, which creates the mask image Md by performing arbitrary binarization processing on the shadow/highlight image J, the brightness-adjusted image.
  • With this configuration, the learning data processing apparatus 100 creates the mask image Md based on the shadow/highlight image J, a brightness-adjusted version of the input image I, and generates the data-augmented image I′ using this mask image Md; as a result, the probability of generating a data-augmented image I′ that deviates greatly from the input image I can be reduced.
  • The data augmentation unit 120 further includes the mask area creation unit 122, which acquires the mask image Md from the mask image data set, a data set of irregular mask images stored in advance in the mask image storage unit 123.
  • With this configuration, the learning data processing apparatus 100 can create data-augmented images I′ using various mask images Md that are independent of the input image I, making it easy to increase the amount of learning data.
  • In the first embodiment, either a data-augmented image I′ obtained by adding an image effect simulating a shadow to the learning data or a data-augmented image I′ obtained by adding an image effect simulating a highlight to the learning data is created; that is, only one of the two types of data-augmented images I′ is created. However, both of these two types of data-augmented images I′ may be created.
  • FIG. 15 is a flow chart showing an example of the processing operation of the learning data processing device according to the second embodiment of the present invention.
  • In the second embodiment, the determination process of step S15 in the first embodiment is omitted, and the processor 11 performs both the acquisition of a mask image Md in step S16 and the creation of a mask image Md in step S17.
  • Since the processor 11 includes a multi-threaded CPU, these processes can be performed concurrently in separate threads.
  • Alternatively, the process of step S16 and the process of step S17 may be performed sequentially; in that case, step S17 may follow step S16, or the order may be reversed.
  • In step S18, the processor 11 creates two types of shadow/highlight application-area images M using the respective mask images Md, and in the process of step S19 creates two types of data-augmented images I′.
  • Then, in step S20, the processor 11 transmits learning data including these two types of data-augmented images I′ to the learning device 200.
  • As described above, the learning data processing apparatus 100 according to the second embodiment uses both a shadow/highlight image J obtained by reducing the brightness of the entire input image I and a shadow/highlight image J obtained by increasing it, and creates both a data-augmented image I′ in which the portion corresponding to the brightness change region, within the region corresponding to the brightness change target region of the input image I, is darkened, and a data-augmented image I′ in which that portion is brightened.
  • With this configuration, the learning data processing apparatus 100 adds image effects simulating both shadows and highlights to the learning data, thereby increasing the number of pseudo lighting-environment patterns in the learning data. This makes it possible to realize deep generative model learning in the learning device 200 that is robust to both shadows and highlights.
  • FIG. 16 is a block diagram showing an example of the configuration of a deep generative model learning system including the learning data processing device 100 according to the third embodiment of the present invention.
  • In the third embodiment, the learning data processing apparatus 100 includes an evaluation unit 140 in addition to the configuration of the first embodiment. The image synthesizing unit 124 of the data augmentation unit 120 passes the created data-augmented image I′ not only to the data output unit 130 but also to the evaluation unit 140.
  • The evaluation unit 140 has an internal anomaly detection model and uses it to evaluate the data-augmented image I′.
  • The anomaly detection model is trained by metric learning, using as learning data an image group A with shadows and highlights obtained from actual images, and an image group B with shadows and highlights arbitrarily created so as to deviate greatly from actual images.
  • The evaluation unit 140 acquires the data-augmented image I′ from the data augmentation unit 120, inputs it to the anomaly detection model, and obtains an evaluation value. When the evaluation value exceeds a user-set threshold, the evaluation unit 140 regards the data-augmented image I′ as an image deviating greatly from an actual image, discards it, and returns control to the data augmentation unit 120 to perform data augmentation again.
  • FIG. 17 is a flowchart showing an example of the processing operation of the learning data processing device 100 according to the third embodiment. Following the processing of step S19, the processor 11 reads out the data-augmented image I′ stored in the temporary storage area 13B and evaluates it using the anomaly detection model learned in advance (step S31).
  • In step S32, the processor 11 determines whether the obtained evaluation value is equal to or less than the threshold. If it is (YES in step S32), the data-augmented image I′ does not deviate greatly from an actual image and is considered suitable as learning data for the learning device 200, so the process proceeds to step S20. As a result, learning data including the data-augmented image I′ is transmitted to the learning device 200.
  • If, on the other hand, the evaluation value exceeds the threshold (NO in step S32), the processor 11 determines that the data-augmented image I′ deviates greatly from an actual image, deletes it from the temporary storage area 13B, and repeats the process from step S13. As a result, a new data-augmented image I′ can be created with a different mask image.
  • As described above, in the third embodiment, the evaluation unit 140 evaluates the data-augmented image I′ created by the data augmentation unit 120, and if it is an image deviating greatly from an actual image, causes the data augmentation unit 120 to create a data-augmented image I′ again.
  • With this configuration, the learning data processing device 100 evaluates the data-augmented image I′ and can thereby prevent learning data unsuitable for learning the deep generative model in the learning device 200 from being created.
  • The learning data processing device 100 of the second embodiment may also include the evaluation unit 140, as in the third embodiment.
  • In the first embodiment, the brightness adjustment unit 121 creates both a shadow image and a highlight image as the shadow/highlight image J, and the mask region creation unit 122 randomly selects one of them for use. Instead, the brightness adjustment unit 121 may randomly generate only one of the images, and the mask region creation unit 122 may use a mask image corresponding to the shadow/highlight image J thus generated.
  • In the first embodiment, one data-augmented image I′ of one type is created, and in the second embodiment, two types of data-augmented image I′ are created, one each. The number of data-augmented images I′ may also be increased by creating more than one image of each type.
  • When the data-augmented image I′ is created in step S19 and the mask image Md has been obtained from the mask image data set, the shadow image has been used as the shadow/highlight image J; however, the highlight image may be used as the shadow/highlight image J in that case as well. Which image to use as the shadow/highlight image J may be determined by a random parameter, or both may be used to create two types of data-augmented image I′.
  • The learning data storage unit 300 may be configured as part of the learning data processing device 100; that is, the data memory 13 may be provided with a storage area serving as the learning data storage unit 300.
  • The learning device 200 may incorporate the functions of the learning data processing device 100 of each embodiment.
  • The method described in each embodiment can be stored, as a program (software means) executable by a computer, in a recording medium such as a magnetic disk (floppy (registered trademark) disk, hard disk, etc.), an optical disk (CD-ROM, DVD, MO, etc.), or a semiconductor memory (ROM, RAM, flash memory, etc.), or can be transmitted and distributed via a communication medium.
  • The programs stored on the medium also include a setting program for configuring, in the computer, the software means (including not only execution programs but also tables and data structures) to be executed by the computer.
  • A computer that realizes this apparatus reads the program recorded on the recording medium, in some cases builds the software means using the setting program, and executes the above-described processes with its operation controlled by the software means.
  • The term "recording medium" as used in this specification is not limited to media for distribution, and includes storage media such as magnetic disks and semiconductor memories provided in the computer or in devices connected via a network.
  • The present invention is not limited to the above embodiments and can be modified in various ways at the implementation stage without departing from the gist of the invention. The embodiments may also be combined wherever possible, in which case combined effects can be obtained. Furthermore, the above embodiments include inventions at various stages, and various inventions can be extracted by appropriately combining the plurality of disclosed constituent elements.
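The evaluate-and-regenerate flow of the third embodiment (steps S31 and S32 above) can be sketched as follows. The anomaly score function, the threshold value, and the retry limit here are illustrative stand-ins, not details taken from the embodiments:

```python
import random

def evaluate_and_retry(make_augmented, score, threshold, rng, max_tries=10):
    """Sketch of the third-embodiment loop: create a data-augmented image,
    score it with the anomaly-detection model, and regenerate while the
    score exceeds the user-set threshold.  `score` stands in for the
    trained metric-learning model; `max_tries` is an added assumption."""
    for _ in range(max_tries):
        image = make_augmented(rng)
        if score(image) <= threshold:
            return image  # close enough to an actual image: accept (step S20)
    return None           # give up after max_tries (assumption, not in the text)

rng = random.Random(1)
accepted = evaluate_and_retry(
    make_augmented=lambda r: r.random(),  # dummy "image": a single number
    score=lambda img: img,                # dummy anomaly score
    threshold=0.3,
    rng=rng,
)
```

In the actual device, `make_augmented` corresponds to repeating the process from step S13 with a different mask image, and `score` to the inference of the pre-trained anomaly detection model.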

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

A training data processing device according to one embodiment of the present invention comprises a data input unit, data expansion unit, and data output unit. The data input unit acquires training data to be used for training a deep generative model including: an input image and an illumination environment for the input image; a teacher image obtained by changing only the illumination environment from the input image, and an illumination environment for the teacher image; and a targeted region image indicating a brightness change region in the input image. The data expansion unit creates a brightness adjustment image obtained by subjecting the input image to brightness adjustment and also creates a mask image indicating a brightness change region in which brightness change is to be made, and synthesizes the targeted region image, the brightness adjustment image, and the mask image so as to create a data expansion image. The data output unit changes the input image in the training data to the data expansion image so as to create new training data, and outputs the new training data as training data to be used for training the deep generative model.

Description

LEARNING DATA PROCESSING DEVICE, LEARNING DATA PROCESSING METHOD, AND LEARNING DATA PROCESSING PROGRAM

 Embodiments of the present invention relate to a learning data processing device, a learning data processing method, and a learning data processing program.

 Relighting is a technique for generating, from an input image, a re-illuminated image in which the lighting environment in the image has been changed to a desired one. This relighting technique uses deep learning to generate the desired re-illuminated image from the input image.

 In methods using deep learning, for example as proposed in Non-Patent Document 1, a deep generative model that generates re-illuminated images is trained using learning data in which an input image is paired with a teacher image obtained by changing only the lighting environment of the input image.

 For example, Non-Patent Document 2 proposes preparing special equipment in which the subject is surrounded by a large number of cameras and lights, and shooting under a variety of shooting and lighting conditions, when creating such learning data in a real environment.

 The input image used for learning is desirably an image without global shadows on the face region caused by the lighting environment being blocked by buildings, trees, or the like; Non-Patent Document 3, for example, proposes a method for removing such shadows.

T. Sun, et al., "Single Image Portrait Relighting," SIGGRAPH 2019.
K. Guo, et al., "The Relightables: Volumetric Performance Capture of Humans with Realistic Relighting," SIGGRAPH 2020.
X. Zhang, et al., "Portrait Shadow Manipulation," SIGGRAPH 2020.

 With special equipment such as that proposed in Non-Patent Document 2, the number of shooting-condition and lighting-condition patterns that would have to be set to cover the irregular shadows and highlights that can occur in real environments becomes enormous, which is not realistic.

 Therefore, when a deep generative model that generates re-illuminated images, as in Non-Patent Document 1, is trained using learning data created as in Non-Patent Document 2, the learning data contain few lighting-environment patterns. Consequently, after the deep generative model is trained, when a re-illuminated image is generated from an input image bearing shadows or highlights not included in the learning data, those shadows or highlights remain in the generated re-illuminated image.

 To remove the shadows in the re-illuminated image, additional shadow-removal processing such as that of Non-Patent Document 3 then becomes necessary.

 The present invention seeks to provide a technique that makes it possible to realize learning of a deep generative model that is robust to shadows and highlights.

 To solve the above problems, a learning data processing device according to one aspect of the present invention includes a data input unit, a data augmentation unit, and a data output unit. The data input unit acquires learning data used for training a deep generative model, including: an input image and the lighting environment of the input image; a teacher image, which is an image obtained by changing only the lighting environment of the input image, and the lighting environment of the teacher image; and a target region image indicating the brightness change target region in the input image. The data augmentation unit creates a brightness-adjusted image obtained by performing brightness adjustment on the input image, creates a mask image indicating the brightness change region to which a brightness change is to be applied, and combines the target region image, the brightness-adjusted image, and the mask image to create a data-augmented image. The data output unit creates new learning data by replacing the input image in the learning data with the data-augmented image, and outputs the new learning data as learning data used for training the deep generative model.

 According to one aspect of the present invention, it is possible to provide a technique that makes it possible to realize learning of a deep generative model that is robust to shadows and highlights.

FIG. 1 is a block diagram showing an example of the configuration of a deep generative model learning system including a learning data processing device according to the first embodiment of the present invention.
FIG. 2 is a diagram showing an example of a mask image data set possessed by the learning data processing device.
FIG. 3 is a diagram showing an example of the hardware configuration of the learning data processing device.
FIG. 4 is a flowchart showing an example of the processing operation of the learning data processing device.
FIG. 5 is a diagram showing an example of an input image, which is one item of the learning data.
FIG. 6 is a diagram showing an example of a target region image, which is one item of the learning data.
FIG. 7 is a diagram showing an example of a shadow/highlight image created by the learning data processing device during processing.
FIG. 8 is a diagram showing another example of a shadow/highlight image.
FIG. 9 is a diagram showing an example of a mask image.
FIG. 10 is a diagram showing another example of a mask image.
FIG. 11 is a diagram showing an example of an inverted shadow/highlight application-area image to be combined.
FIG. 12 is a diagram showing another example of an inverted shadow/highlight application-area image to be combined.
FIG. 13 is a diagram showing an example of a data-augmented image created by the learning data processing device.
FIG. 14 is a diagram showing another example of a data-augmented image created by the learning data processing device.
FIG. 15 is a flowchart showing an example of the processing operation of the learning data processing device according to the second embodiment of the present invention.
FIG. 16 is a block diagram showing an example of the configuration of a deep generative model learning system including a learning data processing device according to the third embodiment of the present invention.
FIG. 17 is a flowchart showing an example of the processing operation of the learning data processing device according to the third embodiment.

 Hereinafter, embodiments according to the present invention will be described with reference to the drawings.

 [First Embodiment]
 (Configuration example)
 FIG. 1 is a block diagram showing an example of the configuration of a deep generative model learning system including a learning data processing device 100 according to the first embodiment of the present invention. The deep generative model learning system includes the learning data processing device 100, a learning device 200, and a learning data storage unit 300. These units may be integrated into a single device or housing, or the system may be composed of a plurality of devices; the devices may also be located remotely from one another and connected via a network.

 The learning data storage unit 300 stores the learning data necessary for learning in the learning device 200. The learning data include: an input image and the lighting environment of the input image; a teacher image, which is an image obtained by changing only the lighting environment of the input image, and the lighting environment of the teacher image; and a target region image indicating the brightness change target region in the input image, that is, the region to which a shadow or highlight is to be applied (for example, the face of a person). The lighting environment is held, for example, either as vector data using spherical harmonics or as an environment map image expressing the reflections around the image. Passing all of the prepared learning data from the learning data storage unit 300 to the learning data processing device 100 once constitutes one epoch, and in each epoch the order of the learning data is randomly shuffled before being passed to the learning data processing device 100.
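As an illustrative sketch (the field names, file names, and dataset size below are hypothetical, not taken from the embodiments), one learning-data record and the per-epoch random shuffle could look like:

```python
import random

# One learning-data record holding the five elements described above.
# Field names are illustrative placeholders, not from the patent text.
def make_record(i):
    return {
        "input_image": f"input_{i}.png",
        "input_lighting": f"sh_vector_{i}",       # e.g. spherical-harmonics vector
        "teacher_image": f"teacher_{i}.png",
        "teacher_lighting": f"sh_vector_{i}_t",
        "target_region_image": f"region_{i}.png",  # e.g. the face region
    }

dataset = [make_record(i) for i in range(8)]

def epoch(dataset, rng):
    """Yield all records exactly once, in a freshly randomized order
    (one epoch as described above)."""
    order = list(dataset)
    rng.shuffle(order)
    return order

rng = random.Random(0)
epoch1 = epoch(dataset, rng)
epoch2 = epoch(dataset, rng)  # a later epoch sees a different order
```

Each epoch visits every record once; only the order changes between epochs.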

 The learning data processing device 100 performs data preprocessing, including data augmentation, on the learning data acquired from the learning data storage unit 300. Data augmentation here refers to processing that adds an image effect simulating a shadow or highlight to the learning data. The learning data processing device 100 passes the preprocessed learning data to the learning device 200.

 The learning device 200 trains the deep generative model using the learning data passed from the learning data processing device 100.

 The learning device 200 also generates a re-illuminated image from an arbitrary input image using the trained deep generative model. The input image may be acquired via the learning data processing device 100, or via an input device or network (not shown). Furthermore, the learning device 200 updates the parameters of the deep generative model and records the model by evaluating the generated re-illuminated image against the learning data.

 As shown in FIG. 1, the learning data processing device 100 includes a data input unit 110, a data augmentation unit 120, and a data output unit 130.

 The data input unit 110 acquires the learning data, that is, the input image and its lighting environment, the teacher image and its lighting environment, and the target region image, from the learning data storage unit 300. Of these, the data input unit 110 passes the input image and its lighting environment, and the teacher image and its lighting environment, to the data output unit 130. The data input unit 110 also decides, using a random parameter, whether to perform data augmentation that increases the influence of lighting; when data augmentation is to be performed, it passes the input image and the target region image of the learning data to the data augmentation unit 120.
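The random decision of whether to apply the lighting-effect data augmentation to a given sample could be sketched as follows; the probability value `p` is an assumed parameter, since the text only mentions a random parameter:

```python
import random

def decide_augmentation(rng, p=0.5):
    """Return True when this sample should receive the lighting-effect
    data augmentation.  p is an assumed augmentation probability; the
    patent text only states that a random parameter is used."""
    return rng.random() < p

rng = random.Random(42)
decisions = [decide_augmentation(rng) for _ in range(1000)]
```

Over many samples, roughly a fraction `p` of the learning data is routed to the data augmentation unit and the remainder passes straight to the data output unit.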

 The data augmentation unit 120 includes a brightness adjustment unit 121, a mask region creation unit 122, a mask image storage unit 123, and an image synthesizing unit 124.

 The brightness adjustment unit 121 passes the input image received from the data input unit 110 to the image synthesizing unit 124. The brightness adjustment unit 121 also creates a shadow/highlight image, which is a brightness-adjusted image obtained by performing brightness adjustment on the input image. The brightness adjustment unit 121 then passes the created shadow/highlight image and the target region image received from the data input unit 110 to the mask region creation unit 122.
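A minimal sketch of the brightness adjustment, assuming 8-bit images and a simple gain factor (the exact adjustment method is not specified in the text): a gain below 1 yields a shadow image, a gain above 1 a highlight image.

```python
import numpy as np

def brightness_adjust(image, gain):
    """Create a shadow/highlight image by scaling the brightness of the
    entire image: gain < 1 darkens (shadow), gain > 1 brightens
    (highlight).  Values are clipped to the 8-bit range [0, 255].
    The gain-based scheme is an assumption for illustration."""
    out = image.astype(np.float32) * gain
    return np.clip(out, 0, 255).astype(np.uint8)

img = np.full((4, 4, 3), 100, dtype=np.uint8)
shadow_img = brightness_adjust(img, 0.5)     # darkened copy
highlight_img = brightness_adjust(img, 2.0)  # brightened copy
clipped = brightness_adjust(np.full((2, 2, 3), 200, dtype=np.uint8), 2.0)
```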

 The mask image storage unit 123 stores a mask image data set, which is a data set of irregular mask images. FIG. 2 shows an example of this mask image data set. For the mask images, an existing data set may be used, or images created using Perlin noise may be used.
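True Perlin noise uses gradient interpolation and is more involved; as a hedged stand-in, the sketch below thresholds bilinearly smoothed value noise into an irregular binary mask of the kind described above:

```python
import numpy as np

def noise_mask(shape, cell=4, threshold=0.5, seed=0):
    """Create an irregular binary mask in the spirit of the Perlin-noise
    masks mentioned above: a coarse random grid is bilinearly upsampled
    into smooth noise, then thresholded.  This is simple value noise,
    a stand-in for true Perlin noise."""
    rng = np.random.default_rng(seed)
    h, w = shape
    coarse = rng.random((h // cell + 2, w // cell + 2))
    ys = np.linspace(0, coarse.shape[0] - 1.001, h)
    xs = np.linspace(0, coarse.shape[1] - 1.001, w)
    y0, x0 = ys.astype(int), xs.astype(int)
    fy, fx = ys - y0, xs - x0
    # Bilinear interpolation of the coarse grid up to the full resolution.
    top = coarse[y0][:, x0] * (1 - fx) + coarse[y0][:, x0 + 1] * fx
    bot = coarse[y0 + 1][:, x0] * (1 - fx) + coarse[y0 + 1][:, x0 + 1] * fx
    smooth = top * (1 - fy)[:, None] + bot * fy[:, None]
    return (smooth > threshold).astype(np.uint8)

mask = noise_mask((32, 32))
```

Varying `cell`, `threshold`, and `seed` produces differently shaped irregular masks, analogous to the variety in the stored mask image data set.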

 The mask region creation unit 122 creates a shadow/highlight application-area image, which indicates the region to which a shadow or highlight is to be applied, from the shadow/highlight image and the target region image passed from the brightness adjustment unit 121, together with a mask image. The mask image may be a mask image stored in advance in the mask image storage unit 123, or may be created by performing an arbitrary binarization process on the shadow/highlight image. The mask region creation unit 122 can decide, using a random parameter, whether to use a pre-stored mask image or a newly created one. The mask region creation unit 122 passes the shadow/highlight image and the created shadow/highlight application-area image to the image synthesizing unit 124.
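A sketch of building the shadow/highlight application-area image, assuming the arbitrary binarization is a simple luminance threshold and that the target region image is a binary mask (both assumptions; the text leaves the binarization method open):

```python
import numpy as np

def application_area(shadow_image, target_region, threshold=128):
    """Build the shadow/highlight application-area image: binarize the
    brightness-adjusted image, then keep only the pixels inside the
    target region (e.g. the face area).  The luminance proxy and the
    threshold value are assumptions for illustration."""
    gray = shadow_image.mean(axis=2)               # simple luminance proxy
    binary = (gray > threshold).astype(np.uint8)   # arbitrary binarization
    return binary * (target_region > 0).astype(np.uint8)

shadow = np.zeros((4, 4, 3), dtype=np.uint8)
shadow[:2] = 200                                   # bright upper half
target = np.zeros((4, 4), dtype=np.uint8)
target[:, :2] = 1                                  # target region: left half
region = application_area(shadow, target)
```

Only pixels that are both above the binarization threshold and inside the target region survive, which keeps the effect confined to the brightness change target region.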

 The image synthesizing unit 124 combines the input image passed from the brightness adjustment unit 121 with the shadow/highlight image and the shadow/highlight application-area image passed from the mask region creation unit 122 to create a data-augmented image. The image synthesizing unit 124 passes the created data-augmented image to the data output unit 130.
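The synthesis step can be sketched as a masked blend: inside the application area the brightness-adjusted pixel replaces the original, and elsewhere the input pixel is kept. This is one plausible reading of the combination, not the patent's exact formula:

```python
import numpy as np

def composite(input_image, shading_image, region_mask):
    """Create a data-augmented image: inside the shadow/highlight
    application area take the brightness-adjusted pixel, elsewhere keep
    the original input pixel.  A hard (non-feathered) blend is an
    assumption for illustration."""
    m = (region_mask > 0)[..., None]  # broadcast the mask over channels
    return np.where(m, shading_image, input_image)

inp = np.full((4, 4, 3), 100, dtype=np.uint8)
shaded = np.full((4, 4, 3), 40, dtype=np.uint8)   # darkened (shadow) version
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 1                                # application area
augmented = composite(inp, shaded, mask)
```

Using a highlight image instead of the darkened `shaded` array produces the brightened variant of the data-augmented image in the same way.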

 When data augmentation is not performed, the data output unit 130 normalizes or standardizes the input image and the teacher image passed from the data input unit 110. The data output unit 130 then passes the normalized or standardized input image and its lighting environment, and the normalized or standardized teacher image and its lighting environment, to the learning device 200 as learning data.

 When data augmentation is performed, the data output unit 130 replaces the input image passed from the data input unit 110 with the data-augmented image passed from the image synthesizing unit 124 of the data augmentation unit 120. In this case, the data output unit 130 normalizes or standardizes the teacher image and the input image thus replaced by the data-augmented image. The data output unit 130 then passes the normalized or standardized input image and its lighting environment, and the normalized or standardized teacher image and its lighting environment, to the learning device 200 as learning data. That is, the data output unit 130 passes to the learning device 200 new learning data different from the original learning data.
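The normalization and standardization mentioned above could be sketched as follows; the concrete schemes (min-max scaling to [0, 1], and zero-mean unit-variance scaling) are assumptions, since the text does not fix them:

```python
import numpy as np

def normalize(image):
    """Scale 8-bit pixel values into [0, 1] (assumed normalization)."""
    return image.astype(np.float32) / 255.0

def standardize(image):
    """Shift and scale pixel values to zero mean and unit variance
    (assumed standardization); epsilon guards against division by zero."""
    x = image.astype(np.float32)
    return (x - x.mean()) / (x.std() + 1e-8)

img = np.arange(48, dtype=np.uint8).reshape(4, 4, 3)
n = normalize(img)
s = standardize(img)
```

Either transform would be applied identically to the (possibly augmented) input image and the teacher image before they are passed to the learning device 200.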

 FIG. 3 shows an example of the hardware configuration of the learning data processing device 100. The learning data processing device 100 includes, for example, a processor 11, a program memory 12, a data memory 13, an input/output interface 14, and a communication interface 15. The program memory 12, the data memory 13, the input/output interface 14, and the communication interface 15 are connected to the processor 11 via a bus 16. The learning data processing device 100 may be configured as a general-purpose computer such as a personal computer.

 The processor 11 includes a multi-core, multi-threaded CPU (Central Processing Unit) and can execute multiple information processes concurrently.

 The program memory 12 uses, as storage media, a combination of a non-volatile memory that can be written and read at any time, such as an HDD (Hard Disk Drive) or SSD (Solid State Drive), and a non-volatile memory such as a ROM (Read Only Memory), and stores the programs necessary for the processor 11, such as a CPU, to execute the various control processes according to the first embodiment of the present invention. That is, by reading and executing the programs stored in the program memory 12, the processor 11 can function as the data input unit 110, the data augmentation unit 120, and the data output unit 130 shown in FIG. 1. These processing function units may be realized by the sequential processing of a single CPU thread, or in a form allowing concurrent parallel processing on separate CPU threads. They may also be realized by separate CPUs; that is, the learning data processing device 100 may include a plurality of CPUs. Furthermore, at least some of these processing function units may be implemented in the form of various other hardware circuits, including integrated circuits such as an ASIC (Application Specific Integrated Circuit), FPGA (field-programmable gate array), or GPU (Graphics Processing Unit). The programs stored in the program memory 12 can include a learning data processing program as shown in FIG. 3.

 The data memory 13 uses, as storage media, a combination of a non-volatile memory that can be written and read at any time, such as an HDD or SSD, and a volatile memory such as a RAM (Random Access Memory), and is used to pre-store the various data necessary for performing data preprocessing, including data augmentation. For example, a mask image data set storage area 13A for storing the mask image data set can be reserved in the data memory 13; that is, the data memory 13 can function as the mask image storage unit 123. A temporary storage area 13B, used to store the various data acquired and created in the course of data preprocessing including data augmentation, can also be reserved in the data memory 13.

 The input/output interface 14 is an interface with input devices such as a keyboard and mouse (not shown) and output devices such as a liquid crystal monitor. The input/output interface 14 may also include an interface with a reader/writer for memory cards or disk media. When the mask image data set is provided recorded on a memory card or disk medium, the processor 11 can read it via the input/output interface 14 and store it in the mask image data set storage area 13A of the data memory 13.

 The communication interface 15 includes, for example, one or more wired or wireless communication interface units, and enables various kinds of information to be exchanged with devices on a network in accordance with the communication protocol used by that network. As a wired interface, a wired LAN or a USB (Universal Serial Bus) interface, for example, is used; as a wireless interface, a mobile phone communication system such as 4G or 5G, a wireless LAN, or an interface adopting a low-power wireless data communication standard such as Bluetooth (registered trademark), for example, is used. For example, when the learning data storage unit 300 is located on a file server or the like on the network, the processor 11 can receive and acquire learning data from the learning data storage unit 300 via the communication interface 15. Similarly, the processor 11 can also acquire the mask image data set from a device on the network. Furthermore, when the learning device 200 is located on a server device or the like on the network, the processor 11 can transmit learning data to the learning device 200 via the communication interface 15.

 (Operation)
 Next, the operation of the learning data processing device 100 will be described.

 FIG. 4 is a flowchart showing an example of the processing operation of the learning data processing device 100. When the user instructs execution of the learning data processing program from an input device (not shown) via the input/output interface 14, the processor 11 starts the operation shown in this flowchart. Alternatively, the processor 11 may start the operation shown in this flowchart in response to an execution instruction from the learning device 200 on the network, received via the communication interface 15.

 First, the processor 11 operates as the data input unit 110 and acquires learning data from the learning data storage unit 300 (step S11). The acquired learning data is stored in the temporary storage area 13B of the data memory 13. The learning data includes an input image and its lighting environment, a teacher image and its lighting environment, and a target area image.

 The processor 11 then decides, using random parameters, whether to perform data augmentation that increases the influence of lighting (step S12). If data augmentation is not to be performed (NO in step S12), the processor 11 proceeds to the process of step S20, described later.

 If, on the other hand, data augmentation is to be performed (YES in step S12), the processor 11 operates as the brightness adjustment unit 121 and first acquires the input image and the target area image (step S13). That is, the processor 11 reads the input image and the target area image from the temporary storage area 13B. In the description of the configuration, passing the input image and the target area image from the data input unit 110 to the brightness adjustment unit 121 means storing them in and reading them from the temporary storage area 13B in this way. The same applies throughout the following description.

 FIG. 5 is a diagram showing an example of the input image I, and FIG. 6 is a diagram showing an example of the target area image M_f.

 The processor 11 then performs brightness adjustment on the entire input image I to create a shadow/highlight image (step S14). This brightness adjustment includes an adjustment that lowers the brightness when adding an effect A simulating a shadow as data augmentation, and an adjustment that raises the brightness when adding an effect B simulating a highlight as data augmentation. As the brightness adjustment method, linear correction or gamma correction, for example, can be used, and the method is selected by the user in advance. For the brightness adjustment parameter γ, in both linear correction and gamma correction, the user sets an upper limit and a lower limit in advance under the condition γ < 1.0 for effect A and γ > 1.0 for effect B, and γ is determined randomly within this range. The brightness-adjusted shadow/highlight image J is created as shown in Equation 1 below.

 J = f_γ(I)   (1)

 Here f_γ denotes the brightness correction selected in advance by the user (linear correction or gamma correction), applied with the parameter γ.
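Step S14 can be sketched as follows. The patent constrains only the range of γ (γ < 1.0 for effect A, γ > 1.0 for effect B) and leaves the choice between linear and gamma correction to the user; the gamma-correction convention below (exponent 1/γ) is an assumption chosen so that, as with linear correction, γ < 1.0 darkens the image.

```python
import random

def linear_correction(image, gamma):
    # J = gamma * I, clipped to [0, 255]; gamma < 1.0 darkens (effect A),
    # gamma > 1.0 brightens (effect B).
    return [[min(255, int(round(p * gamma))) for p in row] for row in image]

def gamma_correction(image, gamma):
    # Assumed convention: J = 255 * (I / 255) ** (1 / gamma), so that
    # gamma < 1.0 darkens and gamma > 1.0 brightens, matching the text.
    return [[min(255, int(round(255 * (p / 255) ** (1 / gamma)))) for p in row]
            for row in image]

def make_shadow_highlight_image(image, gamma_range, method=linear_correction):
    # gamma_range is the user-set (lower, upper) interval; a range below 1.0
    # yields a shadow image (effect A), a range above 1.0 a highlight image
    # (effect B). gamma is drawn at random within the range, as in step S14.
    gamma = random.uniform(*gamma_range)
    return method(image, gamma)
```

The images here are plain nested lists of 8-bit grayscale values; a real implementation would use an array library, but the arithmetic is the same.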

 FIG. 7 is a diagram showing an example of a shadow/highlight image J whose brightness has been adjusted to add an effect A simulating a shadow as data augmentation; in this case the shadow/highlight image J is a shadow image. FIG. 8 is a diagram showing an example of a shadow/highlight image J whose brightness has been adjusted to add an effect B simulating a highlight as data augmentation; in this case the shadow/highlight image J is a highlight image.

 The processor 11 stores the shadow/highlight image J thus created in the temporary storage area 13B.

 The processor 11 then operates as the mask area creation unit 122 and decides whether to use the mask image data set stored in advance in the mask image storage unit 123, that is, in the mask image data set storage area 13A (step S15). In other words, the processor 11 decides, based on random parameters, whether to use a mask image prepared in advance or to create a mask image from the shadow/highlight image J.

 If it decides to use the mask image data set (YES in step S15), the processor 11 acquires a mask image M_d from the mask image data set stored in the mask image storage unit 123, for example based on random parameters (step S16). FIG. 9 is a diagram showing an example of the acquired mask image M_d. The processor 11 stores the acquired mask image M_d in the temporary storage area 13B.

 If, on the other hand, it decides not to use the mask image data set (NO in step S15), the processor 11 reads, from among the shadow/highlight images J stored in the temporary storage area 13B, the highlight image, that is, the shadow/highlight image J whose brightness was adjusted to add the effect B simulating a highlight as data augmentation. The processor 11 then creates a mask image M_d by applying an arbitrary binarization process to this shadow/highlight image J (step S17). FIG. 10 is a diagram showing an example of the mask image M_d created from this highlight image. The processor 11 stores the created mask image M_d in the temporary storage area 13B.
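The binarization in step S17 is explicitly left "arbitrary" by the text; a minimal sketch using a fixed global threshold (the threshold value below is a hypothetical choice, and Otsu's method or any other binarization would serve equally) might look like this:

```python
def binarize(image, threshold=192):
    # Pixels at or above the threshold (the strongly brightened highlight
    # regions of J) become 1, everything else 0; the result serves as the
    # mask image M_d.
    return [[1 if p >= threshold else 0 for p in row] for row in image]
```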

 Once the mask image M_d has been acquired or created in this way, the processor 11 reads the target area image M_f and the mask image M_d from the temporary storage area 13B and creates a shadow/highlight application area image based on them (step S18). That is, as shown in Equation 2 below, the processor 11 takes the logical product of the target area image M_f and the mask image M_d and applies a Gaussian filter g(k, σ) to obtain the shadow/highlight application area image M. The parameters of the Gaussian filter are determined by the user in advance; by default, the filter size is k = 11 and σ = 5.0. The processor 11 then stores the created shadow/highlight application area image M in the temporary storage area 13B.

 M = g(k, σ) ∗ (M_f ∧ M_d)   (2)
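Step S18 can be sketched in pure Python as follows, assuming the Gaussian filter g(k, σ) is a normalized separable kernel applied with edge replication at the borders (the patent specifies only the defaults k = 11, σ = 5.0, so the boundary handling is an assumption):

```python
import math

def gaussian_kernel(k, sigma):
    # Normalized 1-D Gaussian kernel of (odd) size k.
    c = k // 2
    weights = [math.exp(-((i - c) ** 2) / (2 * sigma ** 2)) for i in range(k)]
    total = sum(weights)
    return [w / total for w in weights]

def gaussian_blur(image, k, sigma):
    # Separable Gaussian filter g(k, sigma): one horizontal pass, one
    # vertical pass, replicating edge pixels at the borders.
    kern = gaussian_kernel(k, sigma)
    c = k // 2
    h, w = len(image), len(image[0])
    horiz = [[sum(kern[j] * row[min(w - 1, max(0, x + j - c))] for j in range(k))
              for x in range(w)] for row in image]
    return [[sum(kern[j] * horiz[min(h - 1, max(0, y + j - c))][x] for j in range(k))
             for x in range(w)] for y in range(h)]

def application_area_image(target_area, mask, k=11, sigma=5.0):
    # Equation 2: logical product of the binary masks M_f and M_d, then
    # Gaussian smoothing, yielding a soft weighting image M in [0, 1].
    product = [[a * b for a, b in zip(row_f, row_d)]
               for row_f, row_d in zip(target_area, mask)]
    return gaussian_blur(product, k, sigma)
```

Smoothing the hard logical product softens the boundary of the applied shadow or highlight, so the composited effect fades out instead of ending at a sharp edge.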

 After that, the processor 11 operates as the image composition unit 124, reads the input image I, the shadow/highlight image J, and the shadow/highlight application area image M from the temporary storage area 13B, and composites them as shown in Equation 3 below to create a data-augmented image I' (step S19).

 I′ = M ⊙ J + (1 − M) ⊙ I   (3)
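Assuming the composition of Equation 3 is the per-pixel blend I′ = M·J + (1 − M)·I, consistent with the appearance of the inverted image 1 − M in the surrounding text, step S19 can be sketched as:

```python
def composite(input_image, shadow_highlight_image, area_image):
    # I' = M * J + (1 - M) * I, applied per pixel: where M is 1 the
    # brightness-adjusted pixel of J is used, where M is 0 the original
    # pixel of I is kept, and intermediate M values blend the two.
    return [[m * j + (1.0 - m) * i
             for i, j, m in zip(row_i, row_j, row_m)]
            for row_i, row_j, row_m in zip(input_image,
                                           shadow_highlight_image,
                                           area_image)]
```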

 The shadow/highlight image J read here is the shadow image when the mask image M_d was acquired from the mask image data set, and the corresponding highlight image when the mask image M_d was created from a shadow/highlight image J. In step S16 or step S17 above, the processor 11 can store which case applies in the temporary storage area 13B and read it out to determine which shadow/highlight image J to read. Alternatively, when storing the mask image M_d in the temporary storage area 13B in step S16 or step S17, the shadow/highlight image J not used for image composition may be deleted from the temporary storage area 13B.

 FIGS. 11 and 12 are diagrams showing examples of the inverted shadow/highlight application area image, denoted 1 − M in Equation 3, that is used in image composition. FIG. 11 corresponds to the mask image M_d of FIG. 9, and FIG. 12 corresponds to the mask image M_d of FIG. 10.

 The processor 11 then stores the created data-augmented image I' in the temporary storage area 13B. FIGS. 13 and 14 are diagrams showing examples of the created data-augmented image I'. FIG. 13 shows a data-augmented image I' created from the input image I, a shadow/highlight image J that is a shadow image, and the inverted shadow/highlight application area image 1 − M of FIG. 11; FIG. 14 shows a data-augmented image I' created from the input image I, a shadow/highlight image J that is a highlight image, and the inverted shadow/highlight application area image 1 − M of FIG. 12.

 The processor 11 then operates as the data output unit 130 and transmits the learning data (step S20).

 That is, if it determined in step S12 that data augmentation is not to be performed, the processor 11 reads the input image and the teacher image stored in the temporary storage area 13B, normalizes or standardizes them, and stores them in the temporary storage area 13B again. The processor 11 then reads the input image and its lighting environment, and the teacher image and its lighting environment, from the temporary storage area 13B and transmits them to the learning device 200 via the communication interface 15.

 If, on the other hand, it determined in step S12 that data augmentation is to be performed and a data-augmented image I' was generated by the processing of steps S13 to S19, the processor 11 reads the data-augmented image I' stored in the temporary storage area 13B. The processor 11 then normalizes or standardizes the data-augmented image I' and saves the result as the input image I, overwriting the input image I already stored in the temporary storage area 13B. That is, the processor 11 rewrites the input image I stored in the temporary storage area 13B with the normalized or standardized data-augmented image I'. The processor 11 also reads the teacher image stored in the temporary storage area 13B, normalizes or standardizes it, and stores it in the temporary storage area 13B again. The processor 11 then reads the input image and its lighting environment, and the teacher image and its lighting environment, from the temporary storage area 13B and transmits them to the learning device 200 via the communication interface 15.
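The patent does not fix the normalization or standardization formulas applied before transmission; two common choices, min-max scaling of 8-bit pixels to [0, 1] and zero-mean/unit-variance standardization, are sketched here as plausible instances:

```python
def normalize(image):
    # Scale 8-bit pixel values into [0, 1].
    return [[p / 255.0 for p in row] for row in image]

def standardize(image):
    # Shift to zero mean and scale to unit variance over the whole image.
    flat = [p for row in image for p in row]
    mean = sum(flat) / len(flat)
    var = sum((p - mean) ** 2 for p in flat) / len(flat)
    std = var ** 0.5 if var > 0 else 1.0
    return [[(p - mean) / std for p in row] for row in image]
```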

 In the learning data processing device 100 according to the first embodiment described above, the data input unit 110 acquires from the learning data storage unit 300 the learning data used for training the deep generative model, which includes an input image I and its lighting environment, a teacher image, which is an image obtained from the input image I by changing only the lighting environment, and its lighting environment, and a target area image M_f indicating the brightness change target area in the input image I. The data augmentation unit 120 creates a shadow/highlight image J, a brightness-adjusted image obtained by adjusting the brightness of the input image I, acquires or creates a mask image M_d indicating the brightness change area to which the brightness change is applied, and composites the target area image M_f, the shadow/highlight image J, and the mask image M_d to create a data-augmented image I'. The data output unit 130 then changes the input image I in the learning data to the data-augmented image I' to create new learning data, and outputs the new learning data to the learning device 200 as learning data used for training the deep generative model.

 In this way, the learning data processing device 100 according to the first embodiment creates a data-augmented image I' based on the learning data and creates new learning data including this data-augmented image I', thereby increasing the number of pieces of learning data used for training the deep generative model. The learning device 200 can therefore train the deep generative model using learning data in which the influence of irregular lighting environments has been increased, making it possible to realize training of a deep generative model that is robust against shadows and highlights.

 Furthermore, according to the first embodiment, the data augmentation unit 120 includes a brightness adjustment unit 121 that creates the shadow/highlight image J, a brightness-adjusted image, by lowering the brightness of the entire input image I, and an image composition unit 124 that composites the target area image M_f, the shadow/highlight image J, and the mask image M_d to create a data-augmented image I' in which, within the area corresponding to the brightness change target area of the input image I, the portion corresponding to the brightness change area is darkened.

 In this way, the learning data processing device 100 according to the first embodiment adds an image effect simulating a shadow to the learning data, thereby artificially increasing the number of lighting environment patterns in the learning data, making it possible for the learning device 200 to realize training of a deep generative model that is robust against shadows.

 Furthermore, according to the first embodiment, the data augmentation unit 120 includes a brightness adjustment unit 121 that creates the shadow/highlight image J, a brightness-adjusted image, by raising the brightness of the entire input image I, and an image composition unit 124 that composites the target area image M_f, the shadow/highlight image J, and the mask image M_d to create a data-augmented image I' in which, within the area corresponding to the brightness change target area of the input image I, the portion corresponding to the brightness change area is brightened.

 In this way, the learning data processing device 100 according to the first embodiment adds an image effect simulating a highlight to the learning data, thereby artificially increasing the number of lighting environment patterns in the learning data, making it possible for the learning device 200 to realize training of a deep generative model that is robust against highlights.

 Furthermore, according to the first embodiment, the data augmentation unit 120 includes a mask area creation unit 122 that creates the mask image M_d by applying an arbitrary binarization process to the shadow/highlight image J, the brightness-adjusted image.

 In this way, the learning data processing device 100 according to the first embodiment creates a mask image M_d based on the shadow/highlight image J, the brightness-adjusted image, that is, based on the input image I, and creates the data-augmented image I' using this mask image M_d, which reduces the probability that a data-augmented image I' deviating greatly from the input image I will be created.

 Furthermore, according to the first embodiment, the data augmentation unit 120 also includes the mask area creation unit 122, which acquires a mask image M_d from the mask image data set MIDS, a data set of irregular mask images stored in advance in the mask image storage unit 123.

 In this way, the learning data processing device 100 according to the first embodiment can create data-augmented images I' using various mask images M_d that do not depend on the input image I, making it possible to easily increase the number of pieces of learning data.

 [Second Embodiment]
 In the first embodiment, either a data-augmented image I' obtained by data augmentation that adds an image effect simulating a shadow to the learning data or a data-augmented image I' obtained by data augmentation that adds an image effect simulating a highlight is created; that is, only one of the two data-augmented images I' is created. However, both of these two kinds of data-augmented image I' may be created.

 FIG. 15 is a flowchart showing an example of the processing operation of the learning data processing device according to the second embodiment of the present invention. In this embodiment, the determination process of step S15 in the first embodiment is omitted, and the processor 11 performs both the acquisition of a mask image M_d in step S16 and the creation of a mask image M_d in step S17. When the processor 11 includes a multi-threaded CPU, these processes can be performed simultaneously and in parallel on separate threads. Of course, the process of step S16 and the process of step S17 may instead be performed sequentially; in that case, the process of step S17 may be performed after the process of step S16, or vice versa.

 In the process of step S18, the processor 11 creates two kinds of shadow/highlight application area image M using the respective mask images M_d, and in the process of step S19 creates two kinds of data-augmented image I'.

 The processor 11 then transmits, in step S20 above, learning data including these two kinds of data-augmented image I' to the learning device 200.

 The learning data processing device 100 according to the second embodiment described above uses a shadow/highlight image J obtained by lowering the brightness of the entire input image I and a shadow/highlight image J obtained by raising the brightness of the entire input image I to create a data-augmented image I' in which, within the area corresponding to the brightness change target area of the input image I, the portion corresponding to the brightness change area is darkened, and a data-augmented image I' in which the portion corresponding to the brightness change area is brightened.

 In this way, the learning data processing device 100 according to the second embodiment adds image effects simulating shadows and highlights to the learning data, thereby artificially increasing the number of lighting environment patterns in the learning data, making it possible for the learning device 200 to realize training of a deep generative model that is robust against shadows and highlights.

 [Third Embodiment]
 FIG. 16 is a block diagram showing an example of the configuration of a deep generative model learning system including the learning data processing device 100 according to the third embodiment of the present invention. The learning data processing device 100 according to this embodiment includes an evaluation unit 140 in addition to the configuration of the first embodiment. The image composition unit 124 of the data augmentation unit 120 passes the created data-augmented image I' not only to the data output unit 130 but also to this evaluation unit 140.

 The evaluation unit 140 holds an anomaly detection model internally and evaluates the data-augmented image I'. Here, the anomaly detection model has been trained by metric learning, using as learning data an image group A with shadows and highlights obtained from real images and an image group B with shadows and highlights deliberately created so as to deviate greatly from real images. The evaluation unit 140 acquires the data-augmented image I' from the data augmentation unit 120 and inputs it into the anomaly detection model to obtain an evaluation value. When the evaluation value exceeds a threshold set by the user, the evaluation unit 140 regards the data-augmented image I' as an image that deviates greatly from real images, discards it, and causes the data augmentation unit 120 to perform data augmentation again.

 FIG. 17 is a flowchart showing an example of the processing operation of the learning data processing device 100 according to the third embodiment. Following the processing of step S19, the processor 11 reads the data-augmented image I' stored in the temporary storage area 13B and evaluates it with the anomaly detection model trained in advance (step S31).

 The processor 11 then determines whether the obtained evaluation value is at or below the threshold (step S32). If the evaluation value is at or below the threshold (YES in step S32), the data-augmented image I' does not deviate greatly from real images and is regarded as suitable for the learning data of the learning device 200, and the processing proceeds to step S20 described above. As a result, learning data including the data-augmented image I' is transmitted to the learning device 200.

 If, on the other hand, the evaluation value is not at or below the threshold, that is, if it exceeds the threshold (NO in step S32), the processor 11 regards the data-augmented image I' as an image that deviates greatly from real images, deletes it from the temporary storage area 13B, and repeats the processing from step S13 above. This makes it possible to change the mask image and create a new data-augmented image I'.
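The evaluate-and-retry flow of steps S31 and S32 amounts to the following loop. The augmentation and scoring callables are hypothetical stand-ins for steps S13 to S19 and the anomaly detection model, and the retry cap is a safeguard not stated in the text, which simply repeats until an acceptable image is produced:

```python
def create_accepted_augmentation(augment, score, threshold, max_retries=100):
    # Repeat data augmentation (steps S13-S19) until the anomaly-detection
    # evaluation value of the candidate I' is at or below the user-set
    # threshold; candidates above the threshold are discarded.
    for _ in range(max_retries):
        candidate = augment()
        if score(candidate) <= threshold:
            return candidate
    return None  # retry cap exhausted (a safeguard, not part of the flow)
```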

 In the learning data processing device 100 according to the third embodiment described above, the evaluation unit 140 evaluates the data-augmented image I' created by the data augmentation unit 120 and, if the data-augmented image I' is an image that deviates greatly from real images, causes the data augmentation unit 120 to create a data-augmented image I' again.

 In this way, the learning data processing device 100 according to the third embodiment evaluates the data-augmented image I', making it possible to prevent learning data unsuitable for use in training the deep generative model in the learning device 200 from being created.

 It goes without saying that the evaluation unit 140 of this third embodiment may also be added to the learning data processing device 100 according to the second embodiment described above.

 [Other Embodiments]
 In the first and third embodiments described above, the brightness adjustment unit 121 creates both a shadow image and a highlight image as shadow/highlight images J, and the mask area creation unit 122 randomly selects one of them for use. Instead, the brightness adjustment unit 121 may randomly generate one of the two images, and the mask area creation unit 122 may use a mask image corresponding to the shadow/highlight image J generated by the brightness adjustment unit 121.

 Furthermore, while one data-augmented image I' of one kind is created in the first embodiment, and one each of two kinds of data-augmented image I' in the second embodiment, the number of data-augmented images I' created may be increased by acquiring or creating multiple mask images.

 In the first and third embodiments, when the data-augmented image I' is created in step S19 above, the shadow image is used as the shadow/highlight image J if the mask image M_d was acquired from the mask image data set; however, the highlight image may be used as the shadow/highlight image J in such a case as well. Which image to use as the shadow/highlight image J may be determined by a random parameter, or both may be used to create two kinds of data-augmented image I'.

 The learning data storage unit 300 may also be configured as part of the learning data processing device 100; that is, a storage area serving as the learning data storage unit 300 may be provided in the data memory 13.

 Furthermore, the functions of the learning data processing device 100 of the embodiments may be incorporated into the learning device 200.

 The methods described in the embodiments can be stored, as programs (software means) executable by a computer, in recording media such as magnetic disks (floppy (registered trademark) disks, hard disks, etc.), optical disks (CD-ROM, DVD, MO, etc.), and semiconductor memories (ROM, RAM, flash memory, etc.), and can also be transmitted and distributed via communication media. The programs stored on the media include a setting program that configures, within the computer, the software means (including not only execution programs but also tables and data structures) to be executed by the computer. A computer that realizes the present device reads the program recorded on the recording medium, builds the software means by the setting program as the case may be, and executes the above-described processing with its operation controlled by the software means. The recording media referred to in this specification are not limited to those for distribution, and include storage media such as magnetic disks and semiconductor memories provided inside the computer or in devices connected via a network.

 In short, the present invention is not limited to the above embodiments and can be modified in various ways at the implementation stage without departing from the gist of the invention. The embodiments may also be combined as appropriate to the extent possible, in which case the combined effects are obtained. Furthermore, the above embodiments include inventions at various stages, and various inventions can be extracted by appropriately combining the disclosed constituent elements.

 11… Processor
 12… Program memory
 13… Data memory
 13A… Mask image data set storage area
 13B… Temporary storage area
 14… Input/output interface
 15… Communication interface
 16… Bus
 100… Learning data processing device
 110… Data input unit
 120… Data extension unit
 121… Brightness adjustment unit
 122… Mask area creation unit
 123… Mask image storage unit
 124… Image synthesizing unit
 130… Data output unit
 140… Evaluation unit
 200… Learning device
 300… Learning data storage unit
 I… Input image
 I'… Data augmented image
 J… Shadow/highlight image
 Md… Mask image
 Mf… Target area image
 MIDS… Mask image data set
 1-M… Inverted shadow/highlight added area image

Claims (8)

1. A learning data processing device comprising:
 a data input unit that acquires learning data used for training a deep generative model, the learning data including an input image and an illumination environment of the input image, a teacher image that is an image obtained by changing only the illumination environment of the input image and an illumination environment of the teacher image, and a target area image indicating a brightness change target area in the input image;
 a data extension unit that creates a brightness-adjusted image by performing brightness adjustment on the input image, creates a mask image indicating a brightness change area to which a brightness change is to be applied, and synthesizes the target area image, the brightness-adjusted image, and the mask image to create a data augmented image; and
 a data output unit that creates new learning data by replacing the input image in the learning data with the data augmented image, and outputs the new learning data as the learning data used for training the deep generative model.

2. The learning data processing device according to claim 1, wherein the data extension unit includes: a brightness adjustment unit that creates the brightness-adjusted image by lowering the brightness of the entire input image; and an image synthesizing unit that synthesizes the target area image, the brightness-adjusted image, and the mask image to create the data augmented image in which, within the area corresponding to the brightness change target area in the input image, the portion corresponding to the brightness change area is darkened.

3. The learning data processing device according to claim 1, wherein the data extension unit includes: a brightness adjustment unit that creates the brightness-adjusted image by raising the brightness of the entire input image; and an image synthesizing unit that synthesizes the target area image, the brightness-adjusted image, and the mask image to create the data augmented image in which, within the area corresponding to the brightness change target area in the input image, the portion corresponding to the brightness change area is brightened.

4. The learning data processing device according to claim 3, wherein the data extension unit further includes a mask area creation unit that creates the mask image by performing arbitrary binarization processing on the brightness-adjusted image.

5. The learning data processing device according to claim 2 or 3, wherein the data extension unit further includes a mask area creation unit that acquires the mask image from a mask image data set, which is a data set of irregular mask images stored in advance.

6. The learning data processing device according to any one of claims 1 to 5, further comprising an evaluation unit that evaluates the data augmented image and, when the data augmented image greatly deviates from a real image, causes the data extension unit to create the data augmented image again.

7. A learning data processing method in a learning data processing device that has a processor and creates new learning data from learning data used for training a deep generative model, the method comprising:
 acquiring, by the processor, the learning data including an input image and an illumination environment of the input image, a teacher image that is an image obtained by changing only the illumination environment of the input image and an illumination environment of the teacher image, and a target area image indicating a brightness change target area in the input image;
 creating, by the processor, a brightness-adjusted image by performing brightness adjustment on the input image, creating a mask image indicating a brightness change area to which a brightness change is to be applied, and synthesizing the target area image, the brightness-adjusted image, and the mask image to create a data augmented image; and
 creating, by the processor, new learning data by replacing the input image in the learning data with the data augmented image, and outputting the new learning data as the learning data used for training the deep generative model.

8. A learning data processing program that causes a processor to function as each unit of the learning data processing device according to any one of claims 1 to 6.
PCT/JP2021/014159 2021-04-01 2021-04-01 Training data processing device, training data processing method, and training data processing program Ceased WO2022208843A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2023510112A JPWO2022208843A1 (en) 2021-04-01 2021-04-01
PCT/JP2021/014159 WO2022208843A1 (en) 2021-04-01 2021-04-01 Training data processing device, training data processing method, and training data processing program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2021/014159 WO2022208843A1 (en) 2021-04-01 2021-04-01 Training data processing device, training data processing method, and training data processing program

Publications (1)

Publication Number Publication Date
WO2022208843A1 (en) 2022-10-06

Family

ID=83458289

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2021/014159 Ceased WO2022208843A1 (en) 2021-04-01 2021-04-01 Training data processing device, training data processing method, and training data processing program

Country Status (2)

Country Link
JP (1) JPWO2022208843A1 (en)
WO (1) WO2022208843A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2024143813A (en) * 2023-03-30 2024-10-11 横河電機株式会社 Apparatus, method and program

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2018161692A (en) * 2017-03-24 2018-10-18 キヤノン株式会社 Information processing apparatus, information processing method, and program

Also Published As

Publication number Publication date
JPWO2022208843A1 (en) 2022-10-06

Similar Documents

Publication Publication Date Title
Chintha et al. Recurrent convolutional structures for audio spoof and video deepfake detection
JP6657137B2 (en) Information processing apparatus, information processing method, and program
US20200167161A1 (en) Synthetic depth image generation from cad data using generative adversarial neural networks for enhancement
CN113344777B (en) Face swapping and replaying method and device based on 3D face decomposition
JP6778670B2 (en) Information processing device, information processing method, and program
CN109902018B (en) Method for acquiring test case of intelligent driving system
US20170262623A1 (en) Physics-based captcha
KR20220098218A (en) Image-to-image transformation using unpaired data for supervised learning
CN107274358A (en) Image Super-resolution recovery technology based on cGAN algorithms
CN111310156B (en) Automatic identification method and system for slider verification code
US11170203B2 (en) Training data generation method for human facial recognition and data generation apparatus
CN116664422B (en) Image highlight processing method, device, electronic device and readable storage medium
WO2022208843A1 (en) Training data processing device, training data processing method, and training data processing program
CN114090968A (en) Ownership verification method and device for data set
CN113343951A (en) Face recognition countermeasure sample generation method and related equipment
KR20220135890A (en) Method and system for collecting virtual environment-based data for artificial intelligence object recognition model
CN110070017B (en) A method and device for generating a false-eye image of a human face
Dy et al. MCGAN: mask controlled generative adversarial network for image retargeting
CN114282651B (en) Training methods and systems for optical flow prediction models and video generation methods and systems
JP2023030207A (en) LEARNING DEVICE, LEARNING METHOD AND LEARNING PROGRAM
CN115861044B (en) Complex cloud background simulation method, device and equipment based on generative confrontation network
US12347110B2 (en) System and method for synthetic data generation using dead leaves images
CN111915533A (en) A high-precision image information extraction method based on low dynamic range
JP2021056542A (en) Pose detection of object from image data
Jiang Addressing Vulnerabilities in AI-Image Detection: Challenges and Proposed Solutions

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21935005

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2023510112

Country of ref document: JP

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21935005

Country of ref document: EP

Kind code of ref document: A1