
WO2019116494A1 - Learning device, learning method, classification method, and storage medium - Google Patents

Learning device, learning method, classification method, and storage medium

Info

Publication number
WO2019116494A1
Authority
WO
WIPO (PCT)
Prior art keywords
classification
data
conversion
parameter
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2017/044894
Other languages
English (en)
Japanese (ja)
Inventor
和俊 鷺
貴裕 戸泉
裕三 仙田
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corp filed Critical NEC Corp
Priority to PCT/JP2017/044894 priority Critical patent/WO2019116494A1/fr
Priority to JP2019559490A priority patent/JP7184801B2/ja
Priority to EP17934746.3A priority patent/EP3726463B1/fr
Priority to US16/772,035 priority patent/US11270163B2/en
Publication of WO2019116494A1 publication Critical patent/WO2019116494A1/fr
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/217 Validation; Performance evaluation; Active pattern learning techniques
    • G06F18/24 Classification techniques
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features

Definitions

  • the present disclosure relates to a computer-implemented learning technique.
  • A method using an autoencoder is well known as a way of deriving variables that well represent the features of an object from input data.
  • A typical autoencoder consists of an input layer, an intermediate layer, and an output layer.
  • Based on a comparison between the data input to the input layer and the data output by the output layer, a typical autoencoder optimizes the weights and biases used for encoding (i.e., conversion of data in the input layer to data in the intermediate layer) and the weights and biases used for decoding (i.e., conversion of data in the intermediate layer to data in the output layer).
  • The data output at the intermediate layer by encoding, using the weights and biases determined as a result of learning by the autoencoder, can be regarded as information that well represents the features of the object.
  • The data output at this intermediate layer is generally referred to as a “feature vector” or simply a “feature”.
  • The data output at the intermediate layer is also referred to as “a set of values of latent variables” or “a latent variable vector”.
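  • As a concrete illustration of this background (a minimal sketch only; PyTorch, the layer sizes, and the learning rate are assumptions of this example, not part of the disclosure), a generic autoencoder optimizes its encoding and decoding weights by comparing its output with its input:

```python
import torch
import torch.nn as nn

# Minimal autoencoder: encode input to an intermediate layer, decode back,
# and optimize weights/biases based on an input/output comparison.
encoder = nn.Sequential(nn.Linear(784, 144), nn.ReLU())
decoder = nn.Linear(144, 784)
optimizer = torch.optim.SGD(
    list(encoder.parameters()) + list(decoder.parameters()), lr=0.01)

x = torch.rand(32, 784)                        # a batch of input data
z = encoder(x)                                 # latent variable vector ("feature")
loss = nn.functional.mse_loss(decoder(z), x)   # compare output with input
optimizer.zero_grad()
loss.backward()
optimizer.step()
```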
  • Patent Document 1 describes a technology related to the present invention.
  • Patent Document 1 discloses an image processing apparatus that converts (in other words, normalizes) the size, rotation angle, position, and the like of an object in an image into a state suitable for identification.
  • The magnitude of the transformation used for normalization is determined by a factor derived from the relationship between the vector (mapping vector) obtained when the coarsened image data is mapped to a space F by a non-linear transformation and the subspace containing the basis vectors that represent the features of the learning samples.
  • Feature vectors derived by a neural network optimized as a general autoencoder do not necessarily appear related to one another when the same object is recorded in different manners. Suppose, for example, that a classifier for classifying a chair shown in an image as a chair is generated by learning, using as learning data images of the chair photographed in the orientation shown in FIG. 1A, with feature vectors derived by such a neural network. In that case, the generated classifier may be unable to recognize as a chair the chair photographed in the orientation shown in FIG. 1B or at the angle shown in FIG. 1C. This is because mutually unrelated feature vectors can be derived from data in which the same object is recorded in different aspects (orientations and angles in the above example).
  • The technology described in Patent Document 1 improves identification performance for an object that can take various aspects by normalizing the image.
  • However, a function for performing this normalization needs to be derived by learning, using as learning data images in which the object is captured in various modes.
  • Furthermore, since the pattern identification unit 100 that identifies an object operates on normalized images, there is no guarantee that an object not included in the learning data can be correctly identified.
  • An object of the present invention is to provide a learning device capable of generating a discriminator capable of identifying various aspects of an object even when the object has few samples of recorded data.
  • A learning apparatus according to one aspect of the present invention includes: acquisition means for acquiring a first feature quantity derived, from data in which an identification target is recorded, by an encoder configured to derive, from data in which the same object is recorded in different aspects, feature quantities that can be mutually converted by conversion using a conversion parameter whose value depends on the difference in aspect; conversion means for generating a second feature quantity by performing conversion on the first feature quantity using a value of the conversion parameter; and parameter updating means for updating the value of the classification parameter used for classification by classification means, configured to perform classification with a feature quantity as input, so that the classification means outputs a result indicating the class associated with the identification target as the classification destination when the second feature quantity is input.
  • A learning method according to one aspect of the present invention includes: acquiring a first feature quantity derived, from data in which an identification target is recorded, by an encoder configured to derive, from data in which the same object is recorded in different aspects, feature quantities that can be mutually converted by conversion using a conversion parameter whose value depends on the difference in aspect; generating a second feature quantity by performing conversion on the first feature quantity using a value of the conversion parameter; and updating the value of the classification parameter used for classification by classification means, configured to perform classification with a feature quantity as input, so that the classification means outputs a result indicating the class associated with the identification target as the classification destination when the second feature quantity is input.
  • A storage medium according to one aspect of the present invention stores a program that causes a computer to execute: an acquisition process of acquiring a first feature quantity derived, from data in which an identification target is recorded, by an encoder configured to derive, from data in which the same object is recorded in different aspects, feature quantities that can be mutually converted by conversion using a conversion parameter whose value depends on the difference in aspect; a conversion process of generating a second feature quantity by performing conversion on the first feature quantity using a value of the conversion parameter; and an update process of updating the value of the classification parameter used for classification by a classification unit, configured to perform classification with a feature quantity as input, so that the classification unit outputs a result indicating the class associated with the identification target as the classification destination when the second feature quantity is input.
  • the storage medium is, for example, a computer readable non-transitory storage medium.
  • FIG. 2 is a block diagram showing the configuration of the learning device 31 according to the first embodiment.
  • The learning device 31 performs two kinds of learning: learning of variable derivation and learning of classification.
  • a unit related to learning of variable derivation is referred to as a variable derivation unit 110
  • a unit that performs classification learning is referred to as a classification learning unit 310.
  • The configuration and operation of the variable derivation unit 110 will be described first.
  • The variable derivation unit 110 includes a data acquisition unit 111, an encoder 112, a conversion unit 113, a decoder 114, a parameter update unit 115, and a parameter storage unit 119.
  • the data acquisition unit 111, the encoder 112, the conversion unit 113, the decoder 114, and the parameter update unit 115 are realized by, for example, one or more CPUs (Central Processing Units) that execute a program.
  • the parameter storage unit 119 is, for example, a memory.
  • the parameter storage unit 119 may be an auxiliary storage device such as a hard disk.
  • the parameter storage unit 119 may be external to the learning device 31 and configured to be able to communicate with the learning device 31 by wire or wirelessly.
  • the parameter storage unit 119 stores parameters used in the conversion performed by the encoder 112 and parameters used in the conversion performed by the decoder 114.
  • variable derivation unit 110 may include a storage device for temporarily or non-temporarily storing data.
  • The data used by the variable derivation unit 110 are input data, correct answer data, and difference information indicating the relationship between the input data and the correct answer data.
  • the input data is data in which the target of learning by the variable derivation unit 110 is recorded.
  • an optical image is assumed as an example of input data.
  • An example of input data other than the optical image will be described in the item of “Supplement”.
  • When the input data is an optical image, it is an image in which a target (for example, an object or a person) is shown.
  • the input data is, for example, a vector whose component is the pixel value of each pixel of the image.
  • the size of the image may be any size.
  • the pixel value may be an integer value of 0 to 255, a binary value of 0 or 1, or a floating point number.
  • the type of color may be one or two or more. When there are multiple types of color, the number of components of input data increases in proportion to the number of types. Examples of input data include RGB images, multispectral images, hyperspectral images, and the like.
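  • For instance (a sketch; the image size, the channel count, and the normalization are arbitrary assumptions), an RGB image becomes such a vector by flattening:

```python
import numpy as np

# A 28x28 RGB image yields a vector of 28 * 28 * 3 = 2352 components;
# the component count grows in proportion to the number of color types.
image = np.random.randint(0, 256, size=(28, 28, 3), dtype=np.uint8)
input_vector = image.astype(np.float32).ravel() / 255.0  # components in [0, 1]
```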
  • the data acquisition unit 111 acquires input data, for example, by receiving it from a storage device inside or outside the learning device 31.
  • the learning device 31 may include a device such as a camera capable of acquiring input data, and the data acquisition unit 111 may receive input data from the device.
  • the correct answer data is data used in learning of variable derivation, specifically, in updating of the value of the parameter by the parameter updating unit 115 described later.
  • the correct answer data is data in which an object indicated by the input data is recorded.
  • At least one piece of correct answer data is data in which the object indicated by the input data is recorded in a mode different from the mode in the input data.
  • The mode may be paraphrased as “the way the object is captured” or “the way it looks”. Examples of aspects in an image include orientation, angle, posture, size, distortion, hue, sharpness, and the like.
  • The aspect that may differ between the input data and the correct answer data is defined in advance. That is, the variable derivation unit 110 handles sets of input data and correct answer data that differ in at least one specific aspect.
  • the learning device 31 may treat the input data as one of the correct data.
  • the data acquisition unit 111 acquires correct data, for example, by receiving it from a storage device inside or outside the learning device 31.
  • the learning device 31 may include a device such as a camera capable of acquiring correct data, and the data acquisition unit 111 may receive correct data from the device.
  • the data acquisition unit 111 may generate correct data by processing input data.
  • For example, the data acquisition unit 111 can generate correct answer data by applying to the input data processing that changes the rotation angle of the object, or known techniques that change color tone or sharpness.
  • the difference information is information indicating the relationship between input data and correct data. Specifically, the difference information indicates the difference between the aspect of the object indicated by the input data and the aspect of the object indicated by the correct data.
  • the difference information may be represented by a parameter indicating, for example, whether there is a difference or how much the difference is.
  • As an example, suppose the input data is an image of a chair, and the correct answer data is an image of the chair captured from a direction different from that in the input data.
  • An example of a set of input data and correct answer data is the set of the image of FIG. 1A and the image of FIG. 1B, or the set of the image of FIG. 1A and the image of FIG. 1C.
  • An example of difference information indicating the relationship between the image of FIG. 1A and the image of FIG. 1B is a value indicating the angle of rotation (such as “+60 (degrees)”).
  • An example of difference information indicating the relationship between the image of FIG. 1A and the image of FIG. 1C is a value indicating a change in azimuth (such as “−20 (degrees)”).
  • Examples of the difference indicated by the difference information include: an angle of rotation about an axis perpendicular to the display surface of the image; a difference in the orientation of the object relative to the imaging device; a difference in brightness (an increase or decrease in luminance); a difference in contrast; a difference in noise (noise caused by rain or fog, or noise due to low resolution); and a difference in the presence or absence of obstacles, accessories, or decorations.
  • For example, information indicating the strength of the wind may serve as the difference information.
  • Parameters having a strong relationship with the differences exemplified above may also be adopted as the difference information.
  • The aspect indicated by the adopted difference information does not have to be an aspect whose change can be expressed by processing the input data.
  • the difference information may be a quantitative parameter or a parameter having a plurality of stages.
  • For example, the difference information may be represented by four values: “none”, “weak”, “somewhat strong”, and “strong”.
  • the difference information may be a parameter that takes only two values (eg, "present” and "absent”).
  • the data acquisition unit 111 acquires difference information, for example, by receiving it from a storage device inside or outside the learning device 31.
  • the data acquisition unit 111 may receive input of difference information from a person or an apparatus that grasps the relationship between input data and correct answer data, and may acquire the input difference information.
  • the data acquisition unit 111 may acquire the difference information by specifying the difference by comparing the input data and the correct data.
  • The encoder 112, using for example a neural network, inputs the input data to the input layer of the neural network and derives n values as output.
  • n is the number of units in the output layer of the neural network.
  • This set of n values is referred to in the present disclosure as a set of values of latent variables, or a latent variable vector.
  • the latent variable vector is not limited to a one-dimensional array of multiple values.
  • the number of values to be output may be one.
  • the latent variable vector may be a two or more dimensional array.
  • the structure of the neural network used by the encoder 112 can be freely designed.
  • the number of layers, the number of components in each layer, and the manner of connection between the components are not limited.
  • the encoder 112 may use a convolutional neural network consisting of an input layer having 784 components, an intermediate layer having 512 components, and an output layer having 144 components.
  • The number of values output by the encoder 112 (that is, the number of components of the latent variable vector) may be configured to be equal to or larger than the number of components of the input data.
  • the activation function used in the neural network used by the encoder 112 may be any activation function.
  • Examples of the activation function include an identity function, a sigmoid function, a ReLU (Rectified Linear Unit) function, a hyperbolic tangent function, and the like.
  • the encoder 112 reads parameters (typically, weights and biases) in the neural network to be used from the parameter storage unit 119 and encodes input data.
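  • As an illustration of such an encoder (a sketch under the assumptions above; PyTorch and the fully connected architecture are choices of this example, while the 784-512-144 sizes come from the example in the text):

```python
import torch.nn as nn

# Hypothetical stand-in for the encoder 112: an input layer with 784
# components, an intermediate layer with 512, and a 144-component
# latent variable vector as output.
encoder = nn.Sequential(
    nn.Linear(784, 512),
    nn.ReLU(),            # any activation function may be used here
    nn.Linear(512, 144),
)
```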
  • the conversion of the latent variable vector by the conversion unit 113 is referred to as variable conversion in the present disclosure.
  • the transformation unit 113 transforms the latent variable vector using a transformation function.
  • the conversion unit 113 uses different conversion functions according to the above-described difference information.
  • the conversion unit 113 uses a conversion function using a conversion parameter that takes a value that may differ according to the difference information. After determining the conversion parameter according to the difference information, the conversion unit 113 may convert the latent variable vector using the conversion function using the determined conversion parameter.
  • An example of the transformation function is a function that changes the arrangement of components of the latent variable vector.
  • the conversion function is a function that shifts the arrangement of components of the latent variable vector. The amount to shift may be determined by the conversion parameters.
  • The operation of shifting the arrangement of the components of a vector having n components by k moves the 1st through (n−k)-th components of the vector to the (k+1)-th through n-th positions, and moves the (n−k+1)-th through n-th components to the 1st through k-th positions.
  • As a specific example, suppose the conversion function shifts the arrangement of the components of a latent variable vector having 144 components, based on the value of the conversion parameter p.
  • Suppose also that the difference information acquired by the data acquisition unit 111 is a rotation angle θ, where θ is a multiple of 5 among the integers from 0 (inclusive) to 360 (exclusive).
  • In this case, the value obtained by dividing θ by 5 may be defined as the conversion parameter p.
  • p is a parameter that can take an integer value ranging from 0 to 71.
  • the conversion function may be defined such that a value twice as large as p corresponds to an amount for shifting the arrangement of the components of the latent variable vector.
  • the value of the conversion parameter p corresponding to a rotation of 40 degrees is 8, which corresponds to shifting the arrangement of components of the latent variable vector by 16.
  • the transformation function that shifts the arrangement of the components of the latent variable vector can be expressed, for example, as a multiplication of a transformation matrix representing a shift.
  • Let Z_0 be the latent variable vector, n the number of its components, k the value of the conversion parameter, and S_k the n × n matrix representing the shift. The conversion function is then expressed by the following equation: F(k, Z_0) = S_k · Z_0.
  • The matrix S_k is the matrix shown in FIG.: an n × n matrix in which the entry in the i-th row and (kr + i)-th column is 1 for each i with 1 ≤ i ≤ n − kr, the entry in the (n − kr + j)-th row and j-th column is 1 for each j with 1 ≤ j ≤ kr, and all other entries are 0.
  • Here kr is the value obtained as kr = k × n / N(k), where N(k) is the number of values that k can take.
  • the transformation by the transformation unit 113 generates a new latent variable vector whose number of components is n.
  • The conversion unit 113 may use, instead of the matrix S_k, a matrix generated by applying a Gaussian filter to the matrix S_k.
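  • A minimal sketch of this shift conversion (NumPy and all names are assumptions of this example; the numbers reproduce the 144-component, 40-degree example above):

```python
import numpy as np

def shift_transform(z, p, num_values=72):
    # Variable conversion F(p, Z_0) = S_p . Z_0 realized as a cyclic shift;
    # the shift amount is kr = p * n / N(p), so with n = 144 and N(p) = 72,
    # kr = 2 * p (a 40-degree rotation gives p = 8, i.e., a shift by 16).
    n = z.shape[0]
    kr = p * n // num_values
    return np.roll(z, -kr)

def shift_matrix(n, kr):
    # The permutation matrix S_p: ones at (i, kr + i) and (n - kr + j, j).
    return np.roll(np.eye(n), kr, axis=1)

z = np.random.randn(144)            # latent variable vector from the encoder
p = 40 // 5                         # conversion parameter p for 40 degrees
z_shifted = shift_transform(z, p)   # shifted by 16 components
assert np.allclose(z_shifted, shift_matrix(144, 16) @ z)
```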
  • variable conversion may be subtraction processing of component values in which the amount of subtraction increases according to the size of the difference indicated by the difference information.
  • the smoothing process may be executed a number of times according to the size of the difference indicated by the difference information.
  • the variable conversion is an operation on a predetermined component, and the content of the operation or the number of components subjected to the operation may depend on the size of the difference indicated by the difference information.
  • variable conversion performed by the conversion unit 113 may include identity conversion.
  • variable transformation in the case where the difference information indicates that there is no difference may be identity transformation.
  • the conversion unit 113 may perform variable conversion based on the difference information according to each mode.
  • For example, when the difference information is represented by two parameters (α, β) indicating a three-dimensional change in orientation, the conversion unit 113 may apply an α-dependent conversion function to the latent variable vector and then apply a β-dependent conversion function to the result, to generate a new latent variable vector.
  • Alternatively, a conversion function dependent on α and a conversion function dependent on β may be applied in parallel.
  • the conversion unit 113 may determine one conversion function based on the respective difference information of the differences between two or more types of aspects, and may execute variable conversion using the conversion function.
  • The decoder 114, using for example a neural network (different from the neural network used by the encoder 112), inputs the latent variable vector to the input layer of the neural network and generates, as output, output data consisting of m components.
  • Here m is the number of units of the output layer of the neural network used by the decoder 114. This m is set to the same value as the number of components of the correct answer data. If the input data and the correct answer data are expressed in the same format, m matches the number of components of the input data, that is, the number of units of the input layer of the encoder 112. Generating output data from a latent variable vector by a neural network is also called decoding.
  • the structure of the neural network used by the decoder 114 can be freely designed. For example, there is no limitation on the number of layers, the number of components in the middle layer (if it is a multilayer neural network), and the way in which the components are connected. As an example, the decoder 114 may use a neural network consisting of an input layer with 144 components, an intermediate layer with 512 components, and an output layer with 784 components.
  • the activation function used in the neural network used by the decoder 114 may be any activation function.
  • activation functions include identity functions, sigmoid functions, ReLU functions, hyperbolic tangent functions, and the like.
  • the decoder 114 reads values of parameters (typically, weights and biases) in the neural network to be used from the parameter storage unit 119 and decodes the latent variable vector.
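  • A matching sketch of such a decoder (again an assumption-laden illustration, mirroring the encoder sketch above):

```python
import torch.nn as nn

# Hypothetical stand-in for the decoder 114: 144 latent components ->
# intermediate layer with 512 -> m = 784 output components.
decoder = nn.Sequential(
    nn.Linear(144, 512),
    nn.ReLU(),
    nn.Linear(512, 784),
)
```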
  • the parameter updating unit 115 calculates an error of output data with respect to correct data, for one or more sets of correct data and output data.
  • the parameter updating unit 115 may use, for example, a mean square error as an error function for obtaining an error.
  • the parameter updating unit 115 determines the value of the new parameter so that the calculated error is smaller.
  • The method for determining new parameter values may be any method adopted in a general autoencoder, that is, a known method of optimizing parameter values.
  • For example, the parameter updating unit 115 may calculate the gradient using the error backpropagation method and determine the parameter values using Stochastic Gradient Descent (SGD). Other methods that can be adopted include "RMSprop", "Adagrad", "Adadelta", "Adam", and the like.
  • the parameter updating unit 115 records the determined new parameter value in the parameter storage unit 119. Thereafter, the encoder 112 and the decoder 114 use the values of the new parameters.
  • the above is the specific procedure of updating.
  • the values of the parameters to be updated by the parameter updating unit 115 are the weights and biases of the neural network used by the encoder 112, and the weights and biases of the neural network used by the decoder 114.
  • the conversion parameter used for variable conversion is not included in the parameter to be updated by the parameter updating unit 115.
  • the parameter updating unit 115 may repeatedly update the value of the parameter a predetermined number of times.
  • The predetermined number of times may be determined, for example, by a numerical value received from the user of the learning device 31 via an input interface.
  • An error function used by the parameter updating unit 115 to determine an error can be freely designed.
  • The parameter updating unit 115 may use an error function that takes into account the mean and variance of the latent variable vector, such as the error function used in a VAE (variational autoencoder).
  • First, from input data having m data values (x_1, x_2, ..., x_m) as components, the neural network of the encoder 112 derives a latent variable vector having n components (z_1, z_2, ..., z_n).
  • The latent variable vector is then converted by the conversion unit 113 into another latent variable vector having n components (z'_1, z'_2, ..., z'_n) by variable conversion. From this other latent variable vector, the neural network of the decoder 114 generates output data having m components (y'_1, y'_2, ..., y'_m).
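  • Wiring the sketches above together, one update of the learning of variable derivation might look as follows (an illustrative sketch, not the patented implementation; torch.roll stands in for the shift conversion, and only the encoder and decoder parameters are optimized, since the conversion parameter p is fixed by the difference information):

```python
import torch
import torch.nn as nn

# Reuses the `encoder` and `decoder` sketches defined above.
optimizer = torch.optim.SGD(
    list(encoder.parameters()) + list(decoder.parameters()), lr=0.01)

def variable_derivation_step(x, correct, p):
    z = encoder(x)                              # (x_1,...,x_m) -> (z_1,...,z_n)
    z2 = torch.roll(z, shifts=-2 * p, dims=-1)  # variable conversion by shift
    y = decoder(z2)                             # (z'_1,...,z'_n) -> (y'_1,...,y'_m)
    loss = nn.functional.mse_loss(y, correct)   # error vs. correct answer data
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```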
  • each process included in the process related to learning of variable derivation may be performed in the order of instructions in the program when the process is performed by a device that executes the program.
  • The next process may be started when the unit that has completed its process notifies the unit that executes the next process.
  • each unit that performs processing records, for example, data generated by the respective processing in a storage area included in the learning device 31 or an external storage device.
  • each unit that performs processing may receive data necessary for the respective processing from the unit that generated the data or may read the data from the storage area included in the learning device 31 or an external storage device.
  • the data acquisition unit 111 acquires input data, correct data, and difference information (step S11).
  • the timing when various data are acquired may not be simultaneous.
  • the timing at which the data is acquired may be any time before the processing of the step in which the data is used is performed.
  • the encoder 112 converts the input data into a latent variable vector (step S12).
  • the conversion unit 113 converts the latent variable vector using the value of the conversion parameter according to the difference indicated by the difference information (step S13).
  • the decoder 114 converts the latent variable vector after conversion into output data (step S14).
  • the parameter updating unit 115 determines whether to finish updating the values of the parameters used for the encoder 112 and the decoder 114.
  • the end of the update is, for example, a case where the number of times the parameter update unit 115 has updated the value of the parameter reaches a predetermined number.
  • the end of the update may be when the error of the output data with respect to the correct data is sufficiently small.
  • The parameter updating unit 115 may determine that the error is sufficiently small, and decide to end the updating, in the following cases:
    • when the value indicating the error falls below a predetermined reference value;
    • when the error cannot be reduced any further; or
    • when the amount of reduction of the error (that is, the difference between the error immediately before the last update and the error after the update) or the reduction rate (that is, the ratio of the reduction to the current error) falls below a predetermined reference value.
  • Alternatively, the parameter updating unit 115 may calculate, for each parameter, the absolute change amount of its value (i.e., the absolute value of the change in the parameter value at an update) or its change rate (i.e., the ratio of the absolute change amount to the current value), and determine that the updating is to be ended when the average or the maximum of those values falls below a predetermined reference value.
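  • A sketch of these stopping tests (the function name and thresholds are arbitrary assumptions):

```python
def should_stop(prev_error, error, num_updates, max_updates,
                min_reduction=1e-6, min_rate=1e-4):
    # End after a fixed number of updates, or when the error reduction
    # (or its ratio to the current error) falls below a reference value.
    if num_updates >= max_updates:
        return True
    reduction = prev_error - error
    return (reduction < min_reduction
            or reduction / max(error, 1e-12) < min_rate)
```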
  • the parameter update unit 115 updates the value of the parameter (step S17), and the variable derivation unit 110 performs the process from step S12 to step S14 again.
  • the encoder 112 and the decoder 114 perform the process using the value of the updated parameter.
  • the parameter updating unit 115 compares the output data newly generated by the process of step S14 with the correct answer data again (step S15), and determines whether the updating of the value of the parameter is finished.
  • the variable derivation unit 110 repeats the updating of the parameter value and the generation of the output data using the updated parameter value until it is determined that the updating of the parameter is completed.
  • the process of updating parameter values through such repetition is learning of variable derivation.
  • In other words, the parameter updating unit 115 updates the values of the parameters by learning that uses, as it were, sets of output data and correct answer data as a training data set. Making the parameter values more preferable through repeated updating is also called optimization.
  • If it is determined that the updating of the parameter values is to be ended (YES in step S16), the process of learning of variable derivation ends.
  • With the variable derivation unit 110, latent variable vectors that each express the features of the same target in different aspects, and that are related to each other, can be derived.
  • Based on the specific example described above, examples of the effects achieved by the variable derivation unit 110 are as follows.
  • After learning is completed, the combination of the encoder 112, the conversion unit 113, and the decoder 114 of the variable derivation unit 110 can generate a plurality of images showing the target in different modes according to the conversion parameter. Therefore, even if the aspect of the target in the image changes, the latent variable vector output by the encoder 112 can express that change by conversion. That is, the combination of the encoder 112 and the conversion unit 113 can generate latent variable vectors that each express the features of the target in different modes and that are related to each other.
  • The set of the conversion unit 113 and the decoder 114 may be able to generate data in which an aspect not included in the correct answer data is recorded. For example, suppose that data in which a target in a certain aspect (referred to as “aspect SA”) is recorded and data in which a target in another aspect (referred to as “aspect SC”) is recorded are used in learning of variable derivation. The conversion unit 113 can then derive, from the latent variable vector expressing the target in aspect SA, a latent variable vector expressing an aspect between aspect SA and aspect SC, by variable conversion using half the value of the conversion parameter corresponding to the change from aspect SA to aspect SC.
  • As another example, suppose that in learning of variable derivation, data in which an object in aspect SA (referred to as “target TA”) is recorded, data in which the target TA in aspect SB is recorded, and data in which another object in aspect SA (referred to as “target TB”) is recorded are used as correct answer data. By this learning, the set of the conversion unit 113 and the decoder 114 becomes able to generate, from a latent variable vector, data in which the target TA in aspect SA is recorded and data in which the target TA in aspect SB is recorded.
  • In addition, the conversion unit 113 can derive a latent variable vector representing the target TB in aspect SB by converting the latent variable vector representing the target TB in aspect SA. It is then expected that decoding this converted latent variable vector can generate data in which the target TB in aspect SB is recorded.
  • The encoder 112 may be able to derive a latent variable vector representing a target in an aspect not present in the input data. For example, suppose that data in which the target in aspect SA is recorded and data in which the target in aspect SC is recorded are used as input data in learning of variable derivation. When data in which the target is recorded in an aspect between aspect SA and aspect SC is then input, the derived latent variable vector may be similar (or identical) to a latent variable vector that can be generated by variable conversion from a latent variable vector representing that target. That is, the encoder 112 may be able to derive, from data of an aspect not used in learning, a latent variable vector that can be converted into latent variable vectors representing other aspects.
  • Likewise, the encoder 112 may be able to derive a latent variable vector representing a particular target in an aspect not present in the input data for that target. For example, learning such as that in the example above enables the encoder 112 to derive a latent variable vector representing the target TA in aspect SB. It is therefore considered that the encoder 112 can also derive a latent variable vector representing the target TB in aspect SB from data in which the target TB in aspect SB is recorded. It is then expected that the derived latent variable vector can also be converted, by variable conversion, into a latent variable vector representing the target TB in aspect SA.
  • In other words, the encoder 112 may be able to derive, for the same target in different aspects, latent variable vectors that can be mutually converted by conversion using the conversion parameter.
  • The learning device 31 may handle any data, target, and aspect, as long as two or more pieces of data in which the target is recorded in different aspects, and information (difference information) indicating the difference between those data, can be acquired.
  • Input data is not limited to optical images.
  • the input data may be anything as long as it records an object whose aspect can change and can be represented by a variable that can be transformed by a neural network.
  • An example of input data other than an optical image is SAR data, that is, sensing data acquired by a SAR (Synthetic Aperture Radar).
  • Examples of targets recorded in SAR data are terrain, structures, vehicles, aircraft, and vessels.
  • Examples of aspects that can differ are the azimuth angle and the incidence angle at the time of acquisition of the SAR data. That is, differences resulting from the sensing conditions of the SAR may be adopted as the differences handled by the learning device 31.
  • the input data may be time series data of sensing data acquired by the sensing device.
  • the input data may be sound data.
  • Sound data is data in which sound is recorded.
  • When the input data is sound data, it may be represented, for example, by amplitude as a function of time or by the intensity of a spectrogram for each time window.
  • examples of the object are human voice, utterance content, sound event, music, and the like.
  • An acoustic event is a sound that indicates the occurrence of an event, such as a scream or a shattering sound of glass.
  • Examples of modes that can differ are frequency (pitch), recording location, degree of echo, timbre, reproduction speed of the data (tempo), degree of noise, the thing that produced the sound, the person who produced the sound, or the emotional state of that person.
  • Next, the configuration and operation of the classification learning unit 310 will be described.
  • the classification learning unit 310 includes a data acquisition unit 311, a conversion unit 313, a classification unit 317, a parameter update unit 315, an output unit 316, and a parameter storage unit 319.
  • the data acquisition unit 311, the conversion unit 313, the classification unit 317, the parameter update unit 315, and the output unit 316 are realized by, for example, one or more CPUs that execute a program.
  • the parameter storage unit 319 is, for example, a memory.
  • the parameter storage unit 319 may be an auxiliary storage device such as a hard disk.
  • the parameter storage unit 319 may be external to the learning device 31 and configured to be able to communicate with the learning device 31 by wire or wirelessly.
  • the parameter storage unit 319 stores parameters used in the classification performed by the classification unit 317.
  • the learning device 31 may include a storage device that temporarily or non-temporarily stores data.
  • Data used by the classification learning unit 310 are the latent variable vector derived by the encoder 112 and the correct answer information.
  • The correct answer information is the information that should be output as the classification result by the classification unit 317 described later. The correct answer information is given as a set with input data, and is the information that should be output when the target shown in the associated input data is correctly identified.
  • When the classification performed by the classification unit 317 is a multi-class classification identifying which of L classes (L is an arbitrary integer of 2 or more) the target belongs to, the correct answer information may be an L-dimensional vector in which one component has the value “1” and the other components have the value “0”.
  • Such a vector is also called One-hot data.
  • each component is associated with a class. That is, this One-hot data indicates that the object is classified into the class associated with the component whose value is "1".
  • When the classification performed by the classification unit 317 is a binary classification identifying whether or not the target is a specific object, the correct answer information may be information taking the value “1” or “0”.
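  • For instance (a sketch; NumPy and the class count are assumptions), correct answer information for both cases can be built as follows:

```python
import numpy as np

def one_hot(class_index, num_classes):
    # Multi-class case: value 1 at the component associated with the class,
    # 0 at all other components.
    v = np.zeros(num_classes, dtype=np.float32)
    v[class_index] = 1.0
    return v

one_hot(2, 5)         # -> [0., 0., 1., 0., 0.]
binary_correct = 1.0  # binary case: "1" (is the object) or "0" (is not)
```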
  • the correct answer information is compared with the classification result of the classification unit 317 in the update of the value of the parameter by the parameter update unit 315 described later.
  • the data acquisition unit 311 may acquire the latent variable vector derived by the encoder 112 by reading the latent variable vector from the latent variable storage unit 118.
  • The conversion function used by the conversion unit 313 is of the same type as that used by the conversion unit 113; at most, only the value of the conversion parameter differs.
  • the transformation unit 313 may generate a plurality of different latent variable vectors by a plurality of variable transformations using various values of transformation parameters.
  • The classification unit 317, using for example a neural network, inputs the latent variable vector to the input layer of the neural network and generates, as output, information indicating the classification result.
  • An example of the information indicating the classification result is a multidimensional vector indicating the distribution of the probability (or likelihood) that the target belongs to each of the classes to be classified.
  • the number of components of the multidimensional vector in such a case is the number of classes to be classified.
  • the information indicating the classification result may be a numerical value indicating the probability that the object is a predetermined recognition object.
  • the information indicating the classification result is data expressed in a form that can be compared with the correct answer information.
  • the structure of the neural network used by the classification unit 317 can be freely designed. For example, there is no limitation on the number of layers, the number of components in the middle layer (if it is a multilayer neural network), and the way in which the components are connected. Further, the activation function used in the neural network used by the classification unit 317 may be any activation function.
  • the classification unit 317 reads out values of parameters (typically, weights and biases) in the neural network to be used from the parameter storage unit 319 and performs classification.
  • the parameter updating unit 315 calculates, for one or more sets of information indicating classification results and correct answer information, an error of the information indicating the classification results with respect to the correct answer information.
  • the parameter updating unit 315 may use, for example, a cross entropy as an error function for obtaining an error.
  • the parameter updating unit 315 determines a new parameter value so that the calculated error is smaller.
  • The method for determining new parameter values may be any method adopted in general classifier learning, that is, a known method of optimizing parameter values.
  • For example, the parameter updating unit 315 may calculate the gradient using the error backpropagation method and use SGD to determine the parameter values. Other methods that can be adopted include "RMSprop", "Adagrad", "Adadelta", "Adam", and the like.
  • the parameter updating unit 315 records the determined new parameter value in the parameter storage unit 319. Thereafter, the classification unit 317 uses the value of the new parameter.
  • the above is the specific procedure of updating.
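  • A sketch of one such update in PyTorch (the classifier architecture, class count, and hyperparameters are assumptions; applying softmax to the outputs yields the probability distribution described above):

```python
import torch
import torch.nn as nn

# Hypothetical classification unit 317 over 144-component latent variable
# vectors, classifying into L = 10 classes.
classifier = nn.Sequential(nn.Linear(144, 64), nn.ReLU(), nn.Linear(64, 10))
optimizer = torch.optim.SGD(classifier.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()   # cross entropy as the error function

def classification_step(z2, target_class):
    # z2: batch of converted latent variable vectors, shape (batch, 144);
    # target_class: correct class indices, shape (batch,).
    logits = classifier(z2)
    loss = loss_fn(logits, target_class)  # error vs. correct answer information
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```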
  • the parameter updating unit 315 may repeatedly update the value of the parameter a predetermined number of times.
  • The predetermined number of times may be determined, for example, by a numerical value received from the user of the learning device 31 via an input interface.
  • the output unit 316 outputs the value of the parameter optimized by the parameter updating unit 315 repeatedly updating the value of the parameter.
  • Examples of output destinations of the output by the output unit 316 include a display device, a storage device, and a communication network.
  • the output unit 316 may convert the information so that the display device can display the information.
  • the display device and the storage device described above may be devices outside the learning device 31 or may be components included in the learning device 31.
  • Each process included in the process related to the learning of classification may be performed according to the order of instructions in the program when the process is performed by a device that executes the program.
  • The next process may be started when the unit that has completed its process notifies the unit that executes the next process.
  • each unit that performs processing records, for example, data generated by the respective processing in a storage area included in the learning device 31 or an external storage device.
  • each unit that performs processing may receive data necessary for the respective processing from the unit that generated the data or may read the data from the storage area included in the learning device 31 or an external storage device.
  • the encoder 112 derives a latent variable vector from input data, using parameter values optimized by learning of variable derivation (step S31).
  • the encoder 112 records the derived latent variable vector in the latent variable storage unit 118.
  • the data acquisition unit 311 acquires the latent variable vector derived by the encoder 112 and the correct answer information (step S32).
  • the correct answer information is input to the learning device 31 as a set with the input data. That is, the correct answer information is associated with the input data and the latent variable vector derived from the input data.
  • the converting unit 313 converts the latent variable vector into another latent variable vector (step S33).
  • the classification unit 317 classifies the other latent variable vector (step S34).
  • The parameter updating unit 315 determines whether to end the updating of the values of the parameters used by the classification unit 317.
  • the end of the update is, for example, a case where the number of times the parameter update unit 315 has updated the value of the parameter reaches a predetermined number.
  • The update may also end when the error of the classification result with respect to the correct answer information is sufficiently small.
  • The parameter updating unit 315 may determine that the error is sufficiently small, and decide to end the updating, in the following cases:
    • when the value indicating the error falls below a predetermined reference value;
    • when the error cannot be reduced any further; or
    • when the amount of reduction of the error (that is, the difference between the error immediately before the last update and the error after the update) or the reduction rate (that is, the ratio of the reduction to the current error) falls below a predetermined reference value.
  • Alternatively, the parameter updating unit 315 may calculate, for each parameter, the absolute change amount of its value (i.e., the absolute value of the change in the parameter value at an update) or its change rate (i.e., the ratio of the absolute change amount to the current value), and determine that the updating is to be ended when the average or the maximum of those values falls below a predetermined reference value.
  • the parameter update unit 315 updates the value of the parameter (step S37), and the classification learning unit 310 performs the processes of step S34 and step S35 again.
  • the classification unit 317 performs classification using the value of the updated parameter.
  • the parameter updating unit 315 compares the classification result newly generated in the process of step S34 with the correct answer information again (step S35), and determines whether the updating of the value of the parameter is finished.
  • the classification learning unit 310 repeats the updating of the parameter value and the classification using the updated parameter value until it is determined that the updating of the parameter is finished.
  • the process of updating parameter values through such repetition is classification learning.
  • the parameter updating unit 315 updates the value of the parameter by learning with the combination of the classification result and the correct answer information as a training data set.
  • If it is determined that the updating of the parameter values is to be ended (YES in step S36), the output unit 316 outputs the values of the parameters (step S38).
  • The classification unit 317 using the updated parameter values can output correct classification results from each of the latent variable vectors expressing targets in various aspects. Therefore, by combining the encoder 112 and the classification unit 317, it is possible to generate a discriminator capable of identifying targets in various aspects.
  • the learning device may not include the variable derivation unit 110.
  • The learning device only needs to be configured to be able to acquire latent variable vectors derived by an encoder configured to derive latent variable vectors that are mutually convertible by variable conversion for the same target in different aspects.
  • FIG. 7 is a block diagram showing the configuration of a learning device 32 according to the second embodiment of the present invention.
  • The learning device 32 includes the configuration included in the classification learning unit 310 of the first embodiment, that is, the data acquisition unit 311, the conversion unit 313, the classification unit 317, the parameter updating unit 315, the output unit 316, and the parameter storage unit 319.
  • the learning device 32 is communicably connected to the encoder 312 by wire or wirelessly.
  • the encoder 312 is, for example, the encoder 112 in the first embodiment.
  • The encoder 112 is configured to derive a latent variable vector using a neural network whose parameter values have been optimized by the learning of variable derivation described in the first embodiment.
  • the learning device 32 can also generate classifiers capable of identifying various aspects of the object.
  • the reason is the same as the reason described in the description of the first embodiment.
  • The encoder 312 need not be the encoder 112 of the first embodiment. Another way of constructing an encoder 312 with the desired function (i.e., the function of deriving latent variable vectors that are mutually convertible by variable conversion for the same target in different aspects) is described below.
  • The encoder 312 can be generated by performing learning in which data recording targets in various aspects are used and mutually convertible latent variable vectors are treated as the correct answers.
  • For example, the output data generated by the decoder 114 of the first embodiment may be adopted as the data for this learning, and the latent variable vector output by the conversion unit 113 of the first embodiment may be adopted as the correct latent variable vector.
  • one of the methods of generating the encoder 312 with the desired function is the following method.
  • a learning device 13 provided with a variable derivation unit 110 as shown in FIG. 8 is prepared.
  • The learning device 13 performs the learning of variable derivation described in the first embodiment, using data in which target TAs in various aspects are recorded as input data. By doing so, the combination of the encoder 112, the conversion unit 113, and the decoder 114 becomes able to output output data in which the target TAs in various aspects are recorded.
  • Next, the learning device 13 uses the encoder 112 to derive a latent variable vector from data in which a target TB in some aspect is recorded. The learning device 13 then converts the latent variable vector by variable conversion and decodes it to generate output data, thereby acquiring sets of output data in which the target TB in an unlearned aspect is recorded and the corresponding latent variable vectors.
  • The encoder 312 then performs learning so as to derive the correct latent variable vector from the data in which the target TB in the unlearned aspect is recorded. This enables the encoder 312 to derive, from data in which the target TB in an unlearned aspect is recorded, a latent variable vector that can be converted into latent variable vectors representing the target TB in the learned aspects.
  • the data that needs to be prepared in the above method is data in which the target TAs of various aspects are respectively recorded, and data in which the target TB of an aspect is recorded. There is no need to prepare data in which the target TB in the unlearned aspect is recorded.
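  • One way to sketch this procedure (all names are assumptions; it reuses the encoder/decoder sketches from the first embodiment, with torch.roll again standing in for the variable conversion):

```python
import torch
import torch.nn.functional as F

def encoder312_step(encoder, decoder, encoder312, optimizer, x_tb, p):
    # x_tb: data recording the target TB in a learned aspect;
    # p: conversion parameter corresponding to the unlearned aspect.
    with torch.no_grad():
        z = encoder(x_tb)                           # latent vector for TB
        z_correct = torch.roll(z, -2 * p, dims=-1)  # correct latent vector
        x_generated = decoder(z_correct)            # TB in the unlearned aspect
    loss = F.mse_loss(encoder312(x_generated), z_correct)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```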
  • FIG. 9 is a block diagram showing the configuration of the learning device 30.
  • the learning device 30 includes a data acquisition unit 301, a conversion unit 303, and a parameter update unit 305.
  • the data acquisition unit 301 acquires a first feature amount derived from data in which an identification target is recorded.
  • The first feature amount is a feature amount derived by an encoder configured to derive, from data in which the same target is recorded in different aspects, feature amounts that can be mutually converted by conversion using a conversion parameter whose value depends on the difference in aspect.
  • the method for implementing the above encoder is as described above.
  • Here, a feature amount means a set of values derived from input data by the encoder.
  • The feature amount may also be called information representing the target, a representation of the data, or the like. Deriving a feature amount may also be referred to as "extracting a feature amount".
  • The "latent variable vector" in each of the above embodiments corresponds to the "feature amount" in this embodiment.
  • the feature amount may be held in the form of an array, or may be held as a value of a variable given a name.
  • the conversion unit 303 generates a second feature amount by performing conversion using a conversion parameter on the first feature amount acquired by the data acquisition unit 301.
  • the parameter updating unit 305 updates the value of a parameter (hereinafter also referred to as “classification parameter”) used for classification by a classifier (not shown).
  • the classifier is a module configured to perform classification with the feature amount as an input.
  • the classification unit 317 in each of the above embodiments corresponds to this classifier.
  • the classifier may or may not be included in the learning device 30.
  • the learning device 30 and a device having a classifier function may be communicably connected to each other.
  • the classification parameters may be stored by a learning device or may be stored by a device having a classifier function.
  • the classification parameters are, for example, weights and biases generally used in neural networks.
  • The parameter updating unit 305 updates the value of the classification parameter so that the classifier outputs a result indicating the class associated with the identification target as the classification destination when the second feature amount is input. That is, the learning device 30 performs learning using, as training data, sets of the second feature amount and the result indicating the class associated with the identification target as the classification destination.
  • Updating the value of the classification parameter means, for example, recording a new value of the classification parameter in a storage unit that stores the classification parameter.
  • the parameter updating unit 305 may output the new value of the classification parameter to a device (for example, a storage device, a display device, or an information processing device using a classifier) outside the learning device 30.
  • the data acquisition unit 301 acquires a first feature amount (step S301).
  • the converting unit 303 generates a second feature amount by performing conversion using the conversion parameter on the first feature amount (step S302).
  • the parameter updating unit 305 updates the value of the classification parameter so that the classifier outputs the result indicating the class associated with the identification target as the classification destination when the second feature amount is input (step S303).
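  • Tying steps S301 to S303 together, one possible flow is sketched below; it reuses the hypothetical convert and update_classification_parameters helpers from the earlier sketches, and training_pairs is an assumed source of first feature amounts paired with the classes associated with their identification targets.

        import numpy as np

        rng = np.random.default_rng(0)
        num_classes, feature_dim = 3, 4
        W = rng.normal(size=(num_classes, feature_dim))  # classification parameters
        b = np.zeros(num_classes)

        # Assumed training data: (first feature amount, associated class).
        training_pairs = [(rng.normal(size=feature_dim), int(rng.integers(num_classes)))
                          for _ in range(100)]

        for first_feature, target_class in training_pairs:                   # step S301
            second_feature = convert(first_feature, conversion_parameter=1)  # step S302
            W, b = update_classification_parameters(W, b, second_feature,
                                                    target_class)            # step S303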
  • with the learning device 30, even in a case where there are few samples of data in which a target is recorded, it is possible to generate a discriminator capable of identifying the target in various aspects. The reason is that, if the classifier uses the updated values of the classification parameter, data in which an identification target representable by the second feature amount is recorded is correctly classified into the class associated with that identification target, even if the data was not used in learning.
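  • At identification time, a classifier using the updated values of the classification parameter could then be applied to a feature amount derived from data in an aspect not used in learning; a minimal sketch, reusing the hypothetical W, b, rng, and feature_dim from the previous sketches:

        import numpy as np

        def classify(W, b, feature):
            # Output the class with the highest score for the input feature amount.
            return int(np.argmax(W @ feature + b))

        # Stand-in for a feature amount derived by the encoder from newly
        # observed data in which an identification target is recorded.
        predicted_class = classify(W, b, rng.normal(size=feature_dim))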
  • the processing of each component may be realized, for example, by a computer system reading and executing a program, stored in a computer-readable storage medium, that causes the computer system to execute the processing.
  • the “computer-readable storage medium” is, for example, a portable medium such as an optical disc, a magnetic disc, a magneto-optical disc, or a nonvolatile semiconductor memory, or a storage device such as a ROM (Read Only Memory) or a hard disk incorporated in the computer system.
  • the “computer-readable storage medium” may also include a medium that temporarily holds a program, such as a volatile memory in the computer system, and a medium that transmits a program, such as a communication line, for example a network or a telephone line.
  • the program may realize only a part of the functions described above, or may realize the functions described above in combination with a program already stored in the computer system.
  • the “computer system” is, as an example, a system including a computer 900 as shown in FIG.
  • the computer 900 includes the following configuration:
  • one or more CPUs 901
  • a ROM 902
  • a RAM (Random Access Memory) 903
  • a program 904A loaded into the RAM 903 and stored information 904B
  • a storage device 905 for storing the program 904A and the stored information 904B
  • a drive device 907 for reading and writing the storage medium 906
  • a communication interface 908 connected to a communication network 909
  • an input/output interface 910 for data input/output
  • a bus 911 connecting each component
  • each component of each device in each embodiment is realized by the CPU 901 loading the program 904A, which implements the function of the component, into the RAM 903 and executing it.
  • a program 904A for realizing the function of each component of each device is stored in advance in, for example, the storage device 905 or the ROM 902. Then, the CPU 901 reads the program 904A as necessary.
  • the storage device 905 is, for example, a hard disk.
  • the program 904A may be supplied to the CPU 901 via the communication network 909, or may be stored in advance in the storage medium 906, read by the drive device 907, and supplied to the CPU 901.
  • the storage medium 906 is, for example, a portable medium such as an optical disc, a magnetic disc, a magneto-optical disc, and a nonvolatile semiconductor memory.
  • each device may be realized by a possible combination of a separate computer 900 and a program for each component.
  • a plurality of components included in each device may be realized by a possible combination of one computer 900 and a program.
  • each component of each device may be realized by other general-purpose or dedicated circuits, computers, or the like, or by a combination thereof. These may be configured by a single chip or by a plurality of chips connected via a bus.
  • when a part or all of the components of each device are realized by a plurality of computers, circuits, or the like, the plurality of computers, circuits, or the like may be arranged in a centralized or distributed manner.
  • the computers, circuits, and the like may be realized in a form in which each is connected via a communication network, such as a client-server system or a cloud computing system.
  • [Supplementary Note 1] A learning device comprising:
  • an acquisition unit configured to acquire a first feature amount derived, by an encoder, from data in which an identification target is recorded, the encoder being configured to derive, from data in which the same object in different aspects is respectively recorded, feature amounts that are mutually convertible by a conversion using a conversion parameter that takes a value according to the difference in the aspects;
  • a conversion unit configured to generate a second feature amount by performing a conversion using a value of the conversion parameter on the first feature amount; and
  • a parameter updating unit configured to update the value of a classification parameter used for classification by a classification unit configured to perform classification with a feature amount as input, such that the classification unit outputs a result indicating, as a classification destination, the class associated with the identification target when the second feature amount is input.
  • [Supplementary Note 2]
  • the conversion unit generates a plurality of second feature amounts from the first feature amount by performing a plurality of conversions, each using a value of a different conversion parameter, and
  • the parameter updating unit updates the value of the classification parameter such that the classification unit outputs a result indicating the class associated with the identification target as the classification destination, whichever of the plurality of second feature amounts is input.
  • the learning device according to appendix 1.
  • the conversion unit performs, as the conversion, a conversion that changes the arrangement of the components of the first feature amount.
  • the learning device according to Appendix 1 or 2.
  • the data is an image, and the identification target is an object or a person.
  • the learning device according to any one of appendices 1 to 3.
  • the data is an image generated from sensing data by SAR (Synthetic Aperture Radar), and the difference in the aspects is a difference caused by conditions at the time of sensing by the SAR.
  • the learning device according to any one of appendices 1 to 3.
  • the learning device according to any one of appendices 1 to 5, further comprising the classification unit, which performs classification using the second feature amount as an input.
  • the learning device according to any one of appendices 1 to 6, further comprising the encoder.
  • A learning method comprising: acquiring a first feature amount derived, by an encoder, from data in which an identification target is recorded, the encoder being configured to derive, from data in which the same object in different aspects is respectively recorded, feature amounts that are mutually convertible by a conversion using a conversion parameter that takes a value according to the difference in the aspects; generating a second feature amount by performing a conversion using a value of the conversion parameter on the first feature amount; and updating the value of a classification parameter used for classification by a classification unit configured to perform classification with a feature amount as input, such that the classification unit outputs a result indicating, as a classification destination, the class associated with the identification target when the second feature amount is input.
  • a plurality of second feature amounts are generated from the first feature amount by a plurality of conversions, each using a value of a different conversion parameter;
  • the value of the classification parameter is updated such that the classification unit outputs a result indicating the class associated with the identification target as the classification destination, whichever of the plurality of second feature amounts is input.
  • the data is an image, and the identification target is an object or a person.
  • the learning method according to any one of appendices 9 to 11.
  • the data is an image generated from sensing data by SAR (Synthetic Aperture Radar), and the difference in the aspects is a difference caused by conditions at the time of sensing by the SAR.
  • the learning method according to any one of appendices 9 to 11.
  • A computer-readable storage medium storing a program that causes a computer to execute: acquisition processing of acquiring a first feature amount derived, by an encoder, from data in which an identification target is recorded, the encoder being configured to derive, from data in which the same object in different aspects is respectively recorded, feature amounts that are mutually convertible by a conversion using a conversion parameter that takes a value according to the difference in the aspects; conversion processing of generating a second feature amount by performing a conversion using a value of the conversion parameter on the first feature amount; and parameter update processing of updating the value of a classification parameter used for classification by a classification unit configured to perform classification with a feature amount as input, such that the classification unit outputs a result indicating, as a classification destination, the class associated with the identification target when the second feature amount is input.
  • the conversion processing generates a plurality of second feature amounts from the first feature amount by a plurality of conversions, each using a value of a different conversion parameter.
  • the parameter update processing updates the value of the classification parameter such that the classification unit outputs a result indicating the class associated with the identification target as the classification destination, whichever of the plurality of second feature amounts is input.
  • the storage medium according to appendix 15.
  • the conversion processing performs, as the conversion, a conversion that changes the arrangement of the components of the first feature amount.
  • the data is an image, and the identification target is an object or a person.
  • the data is an image generated from sensing data by SAR (Synthetic Aperture Radar), and the difference in the aspects is a difference caused by conditions at the time of sensing by the SAR.
  • the storage medium according to any one of appendices 15-17.

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to a learning device that can generate a discriminator capable of identifying a target in various aspects, even when there are few samples of data in which the target is recorded. A learning device according to one embodiment comprises: an acquisition unit that acquires a first feature amount derived by an encoder from data in which an identification target is recorded, the encoder being configured to derive, from data in which the same target in various aspects is respectively recorded, feature amounts that are mutually convertible by a conversion using a conversion parameter that takes a value according to the difference in the aspects; a conversion unit that generates a second feature amount by performing a conversion on the first feature amount using the value of the conversion parameter; and a parameter update unit that updates the value of a classification parameter used for classification by a classification unit, configured to perform classification with the second feature amount as input, such that, when the second feature amount has been input, the classification unit outputs a result indicating, as the classification destination, the class associated with the identification target.
PCT/JP2017/044894 2017-12-14 2017-12-14 Dispositif d'apprentissage, procédé d'apprentissage, procédé de tri et support d'enregistrement Ceased WO2019116494A1 (fr)

Priority Applications (4)

Application Number Priority Date Filing Date Title
PCT/JP2017/044894 WO2019116494A1 (fr) 2017-12-14 2017-12-14 Dispositif d'apprentissage, procédé d'apprentissage, procédé de tri et support d'enregistrement
JP2019559490A JP7184801B2 (ja) 2017-12-14 2017-12-14 学習装置、学習方法、および学習プログラム
EP17934746.3A EP3726463B1 (fr) 2017-12-14 2017-12-14 Dispositif d'apprentissage, procédé d'apprentissage, procédé de tri et support d'enregistrement
US16/772,035 US11270163B2 (en) 2017-12-14 2017-12-14 Learning device, learning method, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2017/044894 WO2019116494A1 (fr) 2017-12-14 2017-12-14 Dispositif d'apprentissage, procédé d'apprentissage, procédé de tri et support d'enregistrement

Publications (1)

Publication Number Publication Date
WO2019116494A1 true WO2019116494A1 (fr) 2019-06-20

Family

ID=66819132

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2017/044894 Ceased WO2019116494A1 (fr) 2017-12-14 2017-12-14 Dispositif d'apprentissage, procédé d'apprentissage, procédé de tri et support d'enregistrement

Country Status (4)

Country Link
US (1) US11270163B2 (fr)
EP (1) EP3726463B1 (fr)
JP (1) JP7184801B2 (fr)
WO (1) WO2019116494A1 (fr)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102563953B1 (ko) * 2022-12-20 2023-08-04 국방과학연구소 Image conversion method and apparatus using latent features of an image

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH07121713A (ja) * 1993-10-21 1995-05-12 Kobe Steel Ltd Pattern recognition method
JPH1115973A (ja) * 1997-06-23 1999-01-22 Mitsubishi Electric Corp Image recognition device
JP2004062719A (ja) 2002-07-31 2004-02-26 Fuji Xerox Co Ltd Image processing device
JP2013008364A (ja) * 2011-06-22 2013-01-10 Boeing Co:The Display of images

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
SE528068C2 (sv) * 2004-08-19 2006-08-22 Jan Erik Solem Med Jsolutions Recognition of 3D objects
JP5825641B2 (ja) * 2010-07-23 2015-12-02 国立研究開発法人産業技術総合研究所 Feature extraction system and feature extraction method for pathological tissue images
US9700219B2 (en) * 2013-10-17 2017-07-11 Siemens Healthcare Gmbh Method and system for machine learning based assessment of fractional flow reserve
JP2016197389A (ja) 2015-04-03 2016-11-24 株式会社デンソーアイティーラボラトリ Learning system, learning program, and learning method
US10810469B2 (en) * 2018-05-09 2020-10-20 Adobe Inc. Extracting material properties from a single image
US10719706B1 (en) * 2018-06-19 2020-07-21 Architecture Technology Corporation Systems and methods for nested autoencoding of radar for neural image analysis
US12475280B2 (en) * 2020-03-23 2025-11-18 Ansys, Inc. Generative networks for physics based simulations

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH07121713A (ja) * 1993-10-21 1995-05-12 Kobe Steel Ltd Pattern recognition method
JPH1115973A (ja) * 1997-06-23 1999-01-22 Mitsubishi Electric Corp Image recognition device
JP2004062719A (ja) 2002-07-31 2004-02-26 Fuji Xerox Co Ltd Image processing device
JP2013008364A (ja) * 2011-06-22 2013-01-10 Boeing Co:The Display of images

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ASHIHARA, YUTA ET AL.: "Middle layers sharing for transfer learning to predict rotating image in Deep Learning", IPSJ SIG TECHNICAL REPORT, INTELLIGENT SYSTEM (ICS), no. 1, 24 February 2016 (2016-02-24), pages 1 - 8, XP009521008, ISSN: 2188-885X *
See also references of EP3726463A4

Also Published As

Publication number Publication date
JP7184801B2 (ja) 2022-12-06
US20210081721A1 (en) 2021-03-18
JPWO2019116494A1 (ja) 2020-11-26
US11270163B2 (en) 2022-03-08
EP3726463A4 (fr) 2020-12-23
EP3726463B1 (fr) 2025-02-19
EP3726463A1 (fr) 2020-10-21

Similar Documents

Publication Publication Date Title
CN108961350B (zh) A painting style transfer method based on saliency matching
CN113592991A (zh) Neural-radiance-field-based image rendering method and apparatus, and electronic device
JP2005202932A (ja) Method of classifying data into a plurality of classes
WO2020239208A1 (fr) Method and system for training a model for image generation
KR20210076691A (ko) Method and apparatus for verifying training of a neural network between frameworks
CN117437395A (zh) Object detection model training method, object detection method, and apparatus
US20240013357A1 (en) Recognition system, recognition method, program, learning method, trained model, distillation model and training data set generation method
CN115409694B (zh) Semantics-guided defect image generation method, apparatus, device, and storage medium
CN112132167A (zh) Image generation and neural network training method, apparatus, device, and medium
JP2005092465A (ja) Data recognition device
JP6943295B2 (ja) Learning device, learning method, and learning program
CN118097108A (zh) Training method for an object detection model, object detection method, device, and medium
CN111046893A (zh) Image similarity determination method and apparatus, and image processing method and apparatus
CN116612364B (zh) SAR image target generation method based on an information-maximization generative adversarial network
US11176420B2 (en) Identification device, identification method, and storage medium
EP3903235B1 (fr) Identification of relevant features for generative networks
WO2019116494A1 (fr) Dispositif d'apprentissage, procédé d'apprentissage, procédé de tri et support d'enregistrement
CN117437394A (zh) Model training method, object detection method, and apparatus
JP2020067954A (ja) Information processing apparatus, information processing method, and program
CN116758261B (zh) Generative-adversarial-network-based wideband forward-looking imaging radar target recognition method
CN117671350A (zh) SAR target image generation method based on a deep conditional generative model
CN118097693A (zh) Domain-noise-adaptive document layout analysis method and system
US20220326386A1 (en) Computer-implemented method and system for generating synthetic sensor data, and training method
CN112862758A (zh) Training method of a neural network for detecting the paint application quality of a wall top surface
KR102512018B1 (ko) Class-aware memory network-based image translation apparatus and method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 17934746; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 2019559490; Country of ref document: JP; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)
ENP Entry into the national phase (Ref document number: 2017934746; Country of ref document: EP; Effective date: 20200714)