CN111753859A - Sample generation method, apparatus and device - Google Patents
Sample generation method, apparatus and device
- Publication number
- CN111753859A CN111753859A CN201910233792.XA CN201910233792A CN111753859A CN 111753859 A CN111753859 A CN 111753859A CN 201910233792 A CN201910233792 A CN 201910233792A CN 111753859 A CN111753859 A CN 111753859A
- Authority
- CN
- China
- Prior art keywords
- vector
- neural network
- feature
- standard
- layer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The present invention provides a sample generation method, apparatus and device. The sample generation method includes: acquiring a feature description vector of a specified standard character, where the feature description vector indicates the content of the specified standard character; and converting the specified standard character into a target sample using the feature description vector and a specified non-standard feature vector, where the style of the target sample is the same as the style represented by the non-standard feature vector. Samples of a desired font style can thus be generated without collecting character images in that style, improving sample generation efficiency.
Description
Technical Field
The present invention relates to the technical field of image processing, and in particular to a sample generation method, apparatus and device.
Background
With the development of science and technology, deep learning algorithms perform well in tasks such as classification, detection, and recognition. This performance, however, depends on several factors, including increased computing power and large numbers of training samples; as the "fuel" of algorithm development, training samples are an indispensable part of it. Text recognition likewise requires a large number of character-containing samples for training.
In a related sample generation method, character images are pasted onto background images to synthesize samples. In real scenes, the fonts of text characters are diverse; to enable an algorithm to recognize text characters in real scenes more accurately, training samples in various font styles need to be generated. In that method, whenever samples in a new font style are needed, character images in that style must first be collected to synthesize the samples, so sample generation is inefficient.
Summary of the Invention
In view of this, the present invention provides a sample generation method, apparatus and device that can generate samples of a desired font style without collecting character images in that style, thereby improving sample generation efficiency.
A first aspect of the present invention provides a sample generation method, comprising:
acquiring a feature description vector of a specified standard character, where the feature description vector indicates the content of the specified standard character; and
converting the specified standard character into a target sample using the feature description vector and a specified non-standard feature vector, where the style of the target sample is the same as the style represented by the non-standard feature vector.
According to an embodiment of the present invention, acquiring the feature description vector of the specified standard character includes:
inputting a first image containing the specified standard character into a first neural network in a trained student network, so that the first neural network performs feature extraction on the input first image to obtain the feature description vector.
According to an embodiment of the present invention, the first neural network performing feature extraction on the input first image to obtain the feature description vector includes:
the first neural network performing feature extraction on the first image through at least a convolution layer for performing convolution processing and a first nonlinear transformation layer for performing nonlinear transformation processing, to obtain the feature description vector.
According to an embodiment of the present invention, converting the specified standard character into a target sample using the feature description vector and the specified non-standard feature vector includes:
inputting the feature description vector and the non-standard feature vector into a second neural network in the trained student network, so that the second neural network fuses the feature description vector with the non-standard feature vector to obtain a fusion vector and generates a second image from the fusion vector; and
determining the second image as the target sample.
According to an embodiment of the present invention, the second neural network includes a fusion layer, and the feature description vector and the non-standard feature vector have the same dimension;
the second neural network fusing the feature description vector with the non-standard feature vector to obtain the fusion vector includes:
the second neural network using the fusion layer to superimpose the feature description vector and the non-standard feature vector to obtain the fusion vector.
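The text does not pin the superposition in the fusion layer to a specific operation; reading it as element-wise addition of two equal-dimension vectors, a minimal sketch might look like this (the 512-dimension size is illustrative):

```python
import numpy as np

def fuse_by_superposition(c, s):
    """Fuse a content vector c and a style vector s of the same
    dimension by element-wise addition (assumed superposition)."""
    c = np.asarray(c, dtype=np.float64)
    s = np.asarray(s, dtype=np.float64)
    if c.shape != s.shape:
        raise ValueError("superposition requires equal dimensions")
    return c + s

# 512-dim content vector C and a style vector S padded to 512 dims
c = np.random.randn(512)
s = np.zeros(512)
s[1] = 1.0                      # one-hot style, zero-padded
fused = fuse_by_superposition(c, s)
print(fused.shape)  # (512,)
```

Superposition keeps the fused vector the same size as the content vector, so the downstream layers need no dimension changes.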
According to an embodiment of the present invention, the second neural network includes a fully connected layer and a fusion layer, and the feature description vector and the non-standard feature vector have different dimensions;
the second neural network fusing the feature description vector with the non-standard feature vector to obtain the fusion vector includes:
the second neural network using the fully connected layer to map the non-standard feature vector into a reference vector whose dimension is the same as that of the feature description vector; and
the second neural network using the fusion layer to superimpose the feature description vector and the reference vector to obtain the fusion vector.
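When the dimensions differ, the fully connected mapping can be sketched as an affine transform followed by the same superposition; the weights W and bias b below are random stand-ins for parameters the network would learn:

```python
import numpy as np

def fc_then_superpose(c, s, W, b):
    """Map the style vector s (dimension N2) to a reference vector of
    the content dimension via a fully connected layer, then superimpose
    it on the content vector c. W and b are illustrative parameters."""
    ref = W @ s + b          # fully connected mapping: dim(s) -> dim(c)
    return c + ref           # superposition in the fusion layer

dim_c, n_styles = 512, 10
rng = np.random.default_rng(0)
W = rng.standard_normal((dim_c, n_styles)) * 0.01  # hypothetical weights
b = np.zeros(dim_c)
c = rng.standard_normal(dim_c)
s = np.eye(n_styles)[2]      # one-hot: third style (e.g. Mi Fu)
fused = fc_then_superpose(c, s, W, b)
print(fused.shape)  # (512,)
```

In the trained network the fully connected layer effectively learns one reference vector per style column of W.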
According to an embodiment of the present invention, the second neural network includes a fusion layer;
the second neural network fusing the feature description vector with the non-standard feature vector to obtain the fusion vector includes:
the second neural network using the fusion layer to merge the feature description vector and the non-standard feature vector to obtain the fusion vector.
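Interpreting "merging" here as concatenation (by analogy with the channel-merging operation listed in the neural-network glossary later in the description; this reading is an assumption), the vectors need not share a dimension:

```python
import numpy as np

def fuse_by_merging(c, s):
    """Fuse by merging, interpreted as concatenating the content
    vector c and the style vector s; dimensions need not match."""
    return np.concatenate([np.asarray(c, float), np.asarray(s, float)])

c = np.random.randn(512)     # 512-dim content vector
s = np.eye(10)[1]            # one-hot style vector over 10 styles
fused = fuse_by_merging(c, s)
print(fused.shape)  # (522,)
```

Unlike superposition, merging grows the fused vector, so the first layer after the fusion layer must accept the combined dimension.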
According to an embodiment of the present invention, the second neural network further includes a deconvolution layer for performing deconvolution processing and a second nonlinear transformation layer for performing nonlinear transformation;
the second neural network generating the second image from the fusion vector includes:
the second neural network generating the second image corresponding to the fusion vector through the deconvolution layer and the second nonlinear transformation layer.
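A single deconvolution (transposed convolution) step followed by a nonlinearity can be sketched as follows; the feature-map size, the single-layer depth, and the random kernel are assumptions for illustration, not the patent's actual decoder:

```python
import numpy as np

def deconv2d(x, kernel, stride=2):
    """Transposed convolution (deconvolution): each input element
    scatters a scaled copy of the kernel into the output, upsampling
    the spatial size roughly by `stride`."""
    h, w = x.shape
    k = kernel.shape[0]
    out = np.zeros((h * stride + k - stride, w * stride + k - stride))
    for i in range(h):
        for j in range(w):
            out[i*stride:i*stride+k, j*stride:j*stride+k] += x[i, j] * kernel
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative decoder: reshape a fusion vector into a small feature
# map, deconvolve, and apply a nonlinearity to obtain pixel values.
fused = np.random.randn(64)           # stand-in for a fusion vector
fmap = fused.reshape(8, 8)
kernel = np.random.randn(4, 4) * 0.1  # hypothetical learned kernel
image = sigmoid(deconv2d(fmap, kernel, stride=2))
print(image.shape)  # (18, 18)
```

Stacking several such layers would grow an 8x8 map toward the 64x64 target sample size, with each layer's kernel learned during training.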
According to an embodiment of the present invention, the student network is trained under the supervision of a trained teacher network;
the network parameters of at least one layer in the second neural network are taken from the network parameters of the corresponding layer in the teacher network.
A second aspect of the present invention provides a sample generation apparatus, comprising:
a feature description vector acquisition module, configured to acquire a feature description vector of a specified standard character, where the feature description vector indicates the content of the specified standard character; and
a target sample generation module, configured to convert the specified standard character into a target sample using the feature description vector and a specified non-standard feature vector, where the style of the target sample is the same as the style represented by the non-standard feature vector.
According to an embodiment of the present invention, the feature description vector acquisition module is specifically configured to:
input a first image containing the specified standard character into a first neural network in a trained student network, so that the first neural network performs feature extraction on the input first image to obtain the feature description vector.
According to an embodiment of the present invention, the first neural network performing feature extraction on the input first image to obtain the feature description vector includes:
the first neural network performing feature extraction on the first image through at least a convolution layer for performing convolution processing and a first nonlinear transformation layer for performing nonlinear transformation processing, to obtain the feature description vector.
According to an embodiment of the present invention, the target sample generation module includes:
an image generation unit, configured to input the feature description vector and the non-standard feature vector into a second neural network in the trained student network, so that the second neural network fuses the feature description vector with the non-standard feature vector to obtain a fusion vector and generates a second image from the fusion vector; and
a target sample determination unit, configured to determine the second image as the target sample.
According to an embodiment of the present invention, the second neural network includes a fusion layer, and the feature description vector and the non-standard feature vector have the same dimension;
when fusing the feature description vector with the non-standard feature vector to obtain the fusion vector, the second neural network is specifically configured to:
use the fusion layer to superimpose the feature description vector and the non-standard feature vector to obtain the fusion vector.
According to an embodiment of the present invention, the second neural network includes a fully connected layer and a fusion layer, and the feature description vector and the non-standard feature vector have different dimensions;
when fusing the feature description vector with the non-standard feature vector to obtain the fusion vector, the second neural network is specifically configured to:
use the fully connected layer to map the non-standard feature vector into a reference vector whose dimension is the same as that of the feature description vector; and
use the fusion layer to superimpose the feature description vector and the reference vector to obtain the fusion vector.
According to an embodiment of the present invention, the second neural network includes a fusion layer;
when fusing the feature description vector with the non-standard feature vector to obtain the fusion vector, the second neural network is specifically configured to:
use the fusion layer to merge the feature description vector and the non-standard feature vector to obtain the fusion vector.
According to an embodiment of the present invention, the second neural network further includes a deconvolution layer for performing deconvolution processing and a second nonlinear transformation layer for performing nonlinear transformation;
when generating the second image from the fusion vector, the second neural network is specifically configured to:
use the deconvolution layer and the second nonlinear transformation layer to generate the second image corresponding to the fusion vector.
According to an embodiment of the present invention, the student network is trained under the supervision of a trained teacher network;
the network parameters of at least one layer in the second neural network are taken from the network parameters of the corresponding layer in the teacher network.
A third aspect of the present invention provides an electronic device, including a processor and a memory, where the memory stores a program callable by the processor, and where the processor, when executing the program, implements the sample generation method described in the foregoing embodiments.
A fourth aspect of the present invention provides a machine-readable storage medium on which a program is stored, where the program, when executed by a processor, implements the sample generation method described in the foregoing embodiments.
Embodiments of the present invention have the following beneficial effects:
In the embodiments of the present invention, a feature description vector indicating the content of a specified standard character and a non-standard feature vector representing a certain style can be used to convert the specified standard character into a target sample in that style, without collecting character images in that style to synthesize samples. This improves sample generation efficiency, and samples containing different text content can be generated in various styles as needed, achieving sample diversity.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a sample generation method according to an embodiment of the present invention;
FIG. 2 is a structural block diagram of a sample generation apparatus according to an embodiment of the present invention;
FIG. 3 is a block diagram of the connection structure of a first neural network and a second neural network according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of one training method for the first neural network and the second neural network according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of another training method for the first neural network and the second neural network according to an embodiment of the present invention;
FIG. 6 is a structural block diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments will be described in detail here, examples of which are illustrated in the accompanying drawings. Where the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present invention; rather, they are merely examples of apparatus and methods consistent with some aspects of the invention as detailed in the appended claims.
The terminology used in the present invention is for the purpose of describing particular embodiments only and is not intended to limit the present invention. As used in the present invention and the appended claims, the singular forms "a", "the", and "said" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and includes any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used in the present invention to describe various devices, these devices should not be limited by these terms. These terms are only used to distinguish devices of the same type from one another. For example, without departing from the scope of the present invention, a first device could also be called a second device, and similarly, a second device could be called a first device. Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", or "in response to determining".
To make the description of the present invention clearer and more concise, some technical terms used in the present invention are explained below:
Neural network: a technique abstracted by imitating the structure of the brain. It connects a large number of simple functions in complex ways to form a network system that can fit extremely complex functional relationships, generally including convolution/deconvolution operations, activation operations, pooling operations, and operations such as addition, subtraction, multiplication, division, channel merging, and element rearrangement. Training the network with specific input and output data, and adjusting the connections within it, allows the neural network to learn to fit the mapping relationship between input and output.
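As a toy illustration of "adjusting the connections so the network fits the input-output mapping" (not the patent's networks), a single connection weight can be fit by gradient descent to the mapping x -> 2x:

```python
# Toy illustration: one "connection" w is adjusted by gradient descent
# so that the function w*x fits the mapping x -> 2x seen in the data.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

w, lr = 0.0, 0.01
for _ in range(1000):
    # gradient of the mean squared error 0.5*(w*x - y)^2 w.r.t. w
    grad = sum((w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= lr * grad
print(round(w, 3))  # ≈ 2.0
```

The networks in this patent work the same way in principle, only with far more connections and far richer mappings (images to vectors, vectors to images).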
The sample generation method of the embodiments of the present invention is described in more detail below, but the invention should not be limited thereto. In one embodiment, referring to FIG. 1, a sample generation method may include the following steps:
S100: acquiring a feature description vector C of a specified standard character, where the feature description vector C indicates the content of the specified standard character;
S200: converting the specified standard character into a target sample using the feature description vector C and a specified non-standard feature vector S, where the style of the target sample is the same as the style represented by the non-standard feature vector S.
The execution body of the sample generation method in the embodiments of the present invention may be an electronic device, and more specifically the processor of an electronic device. The electronic device may be, for example, a computer or an embedded device; its specific type is not limited, as long as it has data processing capability.
In step S100, the feature description vector C of the specified standard character is acquired, where the feature description vector C indicates the content of the specified standard character.
The font of the specified standard character may be the Song typeface, the Hei typeface, or the like; the specific font style is not limited, as long as the content of the specified standard character is the text content required by the sample. Before acquiring the feature description vector C of the specified standard character, the specified standard character may first be obtained from the font library of the corresponding font. After the specified standard character is obtained, feature extraction may be performed on it to obtain the feature description vector C describing its content.
Feature extraction may be performed on the specified standard character with a feature extraction algorithm; the specific algorithm is not limited, and examples include the LBP feature extraction algorithm, the HOG feature extraction algorithm, and the SIFT feature extraction operator. Feature extraction may also be implemented by deep learning.
The specified standard character is any standard character in a specified font library. Generally speaking, a computer device is configured with common font libraries by default, or common font libraries can be downloaded from the network, and the specified font library may be any one of them. Taking the Song font library as an example, it contains more than 20,000 Song-style characters, and the specified standard character may be any one of them, depending on the text content required by the sample. If a corresponding sample is generated for each of the more than 20,000 Song-style characters using the embodiments of the present invention, more than 20,000 samples in the desired style containing different text content can be generated.
If there are N1 standard characters in the specified font library, then N1 samples in the desired style containing different text content can be obtained by conversion, where the text content of each sample is the same as that of the corresponding standard character but the style differs. Therefore, in the embodiments of the present invention, multiple samples in a desired style can be generated relatively easily, which can overcome the current scarcity of samples in some font styles, such as the font styles found in calligraphy works.
In step S200, the specified standard character is converted into a target sample using the feature description vector C and the specified non-standard feature vector S, where the style of the target sample is the same as the style represented by the non-standard feature vector S.
The style represented by the non-standard feature vector S (referred to as the target style) may be a common or uncommon calligraphy style, such as the Hei style, the Liu Zongyuan style, or the Mi Fu style; the target style may even be an individual's handwriting. Generally speaking, each person's writing style differs to some degree, and any person's writing style can serve as a target style.
Multiple non-standard feature vectors representing different styles may be preset, and the non-standard feature vector S is one of them. If the total number of preset non-standard feature vectors is N2, and the conversion of the specified standard character is performed for each non-standard feature vector, then N2 target samples containing the same text content in different styles can finally be generated. The samples are more diverse and can reflect more real scenes; training a neural network with these samples can make its text recognition results more accurate.
The encoding form of each non-standard feature vector is not limited; for example, encoding may be performed according to the total number of styles N2, and the vectors may be encoded in one-hot form.
Taking one-hot encoding as an example, suppose there are N2 styles to be generated, such as the Hei style, the Liu Zongyuan style, and the Mi Fu style. When a sample in the Liu Zongyuan style needs to be generated, only the value on the dimension corresponding to the Liu Zongyuan style (the second dimension) in the non-standard feature vector S is encoded as 1, and the values on the remaining dimensions are encoded as 0, so that finally S = [0, 1, 0, …, 0]. When a sample in the Mi Fu style needs to be generated, only the value on the dimension corresponding to the Mi Fu style (the third dimension) is encoded as 1, and the values on the remaining dimensions are encoded as 0, so that finally S = [0, 0, 1, …, 0]. Other styles follow by analogy.
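The one-hot encoding above can be sketched in a few lines (N2 = 5 is an illustrative style count):

```python
def style_vector(style_index, n_styles):
    """One-hot encode style number `style_index` (0-based) among
    `n_styles` styles, as in the Liu Zongyuan / Mi Fu examples above."""
    s = [0] * n_styles
    s[style_index] = 1
    return s

N2 = 5  # illustrative total number of styles
print(style_vector(1, N2))  # Liu Zongyuan (2nd dimension): [0, 1, 0, 0, 0]
print(style_vector(2, N2))  # Mi Fu (3rd dimension):        [0, 0, 1, 0, 0]
```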
结合前述内容而言,在指定字体字库中存在N1个标准字,并编码有N2个非标准特征向量的情况下,总共可生成的目标样本数量为N1与N2的乘积,其中,所有目标样本的风格总数为N2,每种风格有N1个目标样本且每个目标样本包含的文字内容不同。Combined with the foregoing, in the case where there are N1 standard words in the specified font font library and N2 non-standard feature vectors are encoded, the total number of target samples that can be generated is the product of N1 and N2, where the total number of target samples is the product of N1 and N2. The total number of styles is N2, each style has N1 target samples and each target sample contains different text content.
本发明实施例中,可利用用于指示指定标准字的内容的特征描述向量C、及表示某种风格的非标准特征向量S,将指定标准字转换成该风格的目标样本,无需采集该风格下的字符图像来合成样本,提升了样本的生成效率,而且可根据需要生成各种风格下的包含不同文字内容的样本,实现样本的多样性。In this embodiment of the present invention, a feature description vector C used to indicate the content of a specified standard word and a non-standard feature vector S representing a certain style can be used to convert a specified standard word into a target sample of the style without collecting the style. It can improve the efficiency of sample generation, and can generate samples containing different text content in various styles according to needs, so as to realize the diversity of samples.
在一个实施例中,上述方法流程可由样本生成装置100执行,如图2所示,样本生成装置100可以包含2个模块:特征描述向量获取模块101、目标样本生成模块102。特征描述向量获取模块101用于执行上述步骤S100,目标样本生成模块102用于执行上述步骤S200。In one embodiment, the above method flow can be executed by the
在一个实施例中,步骤S100中,所述获取指定标准字的特征描述向量C,包括:In one embodiment, in step S100, the acquiring the feature description vector C of the specified standard word includes:
将包含有所述指定标准字的第一图像输入至已训练的学生网络中的第一神经网络,以由所述第一神经网络对输入的所述第一图像进行特征提取得到特征描述向量C。The first image containing the specified standard word is input to the first neural network in the trained student network, so that the first neural network performs feature extraction on the inputted first image to obtain a feature description vector C .
学生网络是预先训练好的,可以预存在电子设备中、或者存储在外部设备中,在需要执行上述方法时电子设备调用该学生网络中的第一神经网络。The student network is pre-trained and can be pre-stored in the electronic device or stored in an external device, and the electronic device calls the first neural network in the student network when the above method needs to be executed.
第一图像可以是通过采集真实场景中的指定标准字得到的,也可以是通过指定字体字库中的指定标准字经格式转换得到的,具体方式不限。第一图像可预设在电子设备中,在执行时从电子设备中获取第一图像。The first image may be obtained by collecting designated standard words in a real scene, or may be obtained by format conversion of designated standard words in a designated font library, and the specific manner is not limited. The first image may be preset in the electronic device, and the first image is acquired from the electronic device during execution.
宋体字库中存在20000多个宋体字(格式为ttf),可以依据宋体字库中已有的宋体字获取第一图像。比如,可以将宋体字库中的宋体字从ttf格式直接转换为图像格式得到第一图像;或者,可将宋体字库中宋体字与背景数据(比如表示白色背景的背景数据)融合生成第一图像。There are more than 20,000 Song-style characters in the Song-style font library (the format is ttf), and the first image can be obtained according to the Song-style fonts existing in the Song-style font library. For example, the first image can be obtained by directly converting the Song-style characters in the Song-style font library from the ttf format to the image format; or, the first image can be generated by merging the Song-style characters in the Song-style font library with background data (such as background data representing a white background).
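The patent gives no code for this format conversion; the sketch below, using Pillow, is one illustrative way to render a font-library glyph onto a white background as a first image. The function name `render_char`, the glyph position, and the use of the default bitmap font are all assumptions — the default font lacks CJK glyphs, so a Latin character is drawn here, whereas for the patent's example one would load a SimSun `.ttf` via `ImageFont.truetype` and render the actual Song-style character.

```python
from PIL import Image, ImageDraw, ImageFont

def render_char(ch, font=None, size=64):
    """Render one character, black on a white background, as a size*size
    grayscale image - one way to turn a font-library glyph into a first image."""
    img = Image.new("L", (size, size), color=255)   # "L" = 8-bit grayscale, 255 = white
    draw = ImageDraw.Draw(img)
    if font is None:
        font = ImageFont.load_default()
    draw.text((size // 4, size // 4), ch, fill=0, font=font)
    return img

# The default bitmap font has no CJK glyphs, so a Latin character is drawn
# here; for the patent's example one would use, e.g.,
# ImageFont.truetype("simsun.ttf", 48) and render the Song-style character.
first_image = render_char("A")
```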
将第一图像输入至第一神经网络,第一神经网络对该第一图像进行特征提取后,可得到指定标准字的特征描述向量C。第一神经网络的这一功能可通过训练而具备。The first image is input to the first neural network, and after the first neural network performs feature extraction on the first image, a feature description vector C of the specified standard word can be obtained. This function of the first neural network can be provided by training.
具体的,参看图3,第一图像比如是大小为64*64的图像,第一图像中的指定标准字比如为宋体的“睛”,第一神经网络对第一图像进行特征提取得到指示“睛”的512维的特征描述向量C。Specifically, referring to FIG. 3, the first image is, for example, a 64*64 image, and the specified standard word in the first image is, for example, the Song-style character “睛” (jing, “eye”). The first neural network performs feature extraction on the first image to obtain a 512-dimensional feature description vector C indicating “睛”.
在一个实施例中,所述第一神经网络对输入的所述第一图像进行特征提取得到特征描述向量C,包括:In one embodiment, the first neural network performs feature extraction on the inputted first image to obtain a feature description vector C, including:
所述第一神经网络至少通过用于执行卷积处理的卷积层、及用于执行非线性变换处理的第一非线性变换层对所述第一图像进行特征提取得到特征描述向量C。The first neural network performs feature extraction on the first image through at least a convolution layer for performing convolution processing and a first nonlinear transformation layer for performing nonlinear transformation processing to obtain a feature description vector C.
第一神经网络可以包括多层卷积层,卷积层执行的是卷积操作,可以对第一图像进行特征提取得到特征描述向量,并将特征描述向量输出至第一非线性变换层。第一非线性变换层可以增强神经网络的拟合能力,第一非线性变换层输出拟合后的特征描述向量作为特征描述向量。当然,第一神经网络中的层结构也不限于此,还可以包括其他层比如池化层(Pooling),池化层是一种特殊的下采样层,即对卷积得到的特征描述向量进行降维。The first neural network may include multiple convolution layers; a convolution layer performs a convolution operation, extracting features from the first image to obtain a feature description vector, which is output to the first nonlinear transformation layer. The first nonlinear transformation layer enhances the fitting ability of the neural network and outputs the fitted vector as the feature description vector. Of course, the layer structure of the first neural network is not limited to this and may include other layers, such as a pooling layer. A pooling layer is a special down-sampling layer that reduces the dimensionality of the feature description vector obtained by convolution.
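To make the layer roles above concrete, here is a toy NumPy sketch of the described pipeline: convolution, nonlinear transformation (ReLU), pooling, then flattening to a 512-dimensional vector C. The kernel size, random weights, and final linear projection are illustrative assumptions, not the patent's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(x, w):
    """Valid 2-D convolution of a single-channel image - the core
    operation of a convolution layer."""
    kh, kw = w.shape
    H, W = x.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return out

def relu(x):
    """First nonlinear transformation layer (here ReLU)."""
    return np.maximum(x, 0.0)

def max_pool2(x):
    """2x2 max pooling: the special down-sampling layer that reduces
    the dimensionality of the convolved features."""
    H, W = x.shape
    x = x[:H - H % 2, :W - W % 2]
    return x.reshape(x.shape[0] // 2, 2, x.shape[1] // 2, 2).max(axis=(1, 3))

first_image = rng.random((64, 64))        # stand-in for the 64*64 first image
kernel = rng.standard_normal((3, 3))      # one illustrative learned filter
feat = max_pool2(relu(conv2d(first_image, kernel)))   # 62x62 -> 31x31
W_fc = rng.standard_normal((512, feat.size))
C = W_fc @ feat.ravel()                   # 512-dim feature description vector C
```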
第一神经网络比如可以采用VGG、Inception、ResNet等卷积神经网络架构来实现,具体不限于此。卷积神经网络是一种前馈的神经网络,其神经元可以响应有限覆盖范围内周围单元,并通过权值共享和特征汇聚,有效提取图像的结构信息。The first neural network can be implemented, for example, with a convolutional neural network architecture such as VGG, Inception, or ResNet, but is not limited to these. A convolutional neural network is a feed-forward neural network whose neurons respond to surrounding units within a limited receptive field; through weight sharing and feature aggregation, it effectively extracts the structural information of an image.
在一个实施例中,步骤S200中,利用所述特征描述向量C和指定的非标准特征向量S将指定标准字转换成目标样本,包括:In one embodiment, in step S200, using the feature description vector C and the specified non-standard feature vector S to convert a specified standard word into a target sample, including:
S201:将所述特征描述向量C与所述非标准特征向量S输入至已训练的学生网络中的第二神经网络,以由所述第二神经网络将所述特征描述向量C与所述非标准特征向量S进行融合得到融合向量T,利用所述融合向量T生成第二图像;S201: Input the feature description vector C and the non-standard feature vector S into a second neural network in the trained student network, so that the second neural network fuses the feature description vector C with the non-standard feature vector S to obtain a fusion vector T and generates a second image using the fusion vector T;
S202:将所述第二图像确定为所述目标样本。S202: Determine the second image as the target sample.
学生网络是预先训练好的,可以预存在电子设备中或者存储在外部设备中,在需要执行上述方法时电子设备再调用该学生网络中的第二神经网络。The student network is pre-trained and can be pre-stored in the electronic device or stored in an external device, and the electronic device calls the second neural network in the student network when the above method needs to be executed.
将特征描述向量C与非标准特征向量S输入至第二神经网络后,第二神经网络会将输入的特征描述向量C与所述非标准特征向量S进行融合得到融合向量T,利用所述融合向量T生成第二图像。该第二图像作为目标样本,其风格与非标准特征向量S一致且包含的文字内容与指定标准字的内容一致。After the feature description vector C and the non-standard feature vector S are input into the second neural network, the second neural network fuses them to obtain a fusion vector T and uses the fusion vector T to generate the second image. The second image serves as the target sample: its style is consistent with the style represented by the non-standard feature vector S, and the text content it contains is consistent with the content of the specified standard word.
基于对第二神经网络的训练,可以通过输入不同的非标准特征向量来指定第二图像对应的风格,可适用于生成不同字体风格的样本。比如,输入的非标准特征向量S表示的风格为柳宗元体,那么生成的第二图像的风格为柳宗元体;输入的非标准特征向量S表示的风格为米芾体,那么生成的第二图像的风格为米芾体,等等。Based on the training of the second neural network, the style of the second image can be specified by inputting different non-standard feature vectors, so the method can generate samples in different font styles. For example, if the style represented by the input non-standard feature vector S is the Liu Zongyuan style, the generated second image is in the Liu Zongyuan style; if the style represented by the input S is the Mi Fu style, the generated second image is in the Mi Fu style; and so on.
继续参看图3,第一神经网络对第一图像进行特征提取得到指示“睛”的512维的特征描述向量C,并将特征描述向量C输入到第二神经网络中,一并将非标准特征向量S输入到第二神经网络。非标准特征向量S表示的风格比如为指定风格,非标准特征向量S的维度比如为100维。第二神经网络将输入的512维的特征描述向量C和100维的非标准特征向量S融合为一个融合向量T,并利用融合得到的融合向量T生成第二图像,第二图像比如是大小为64*64的图像,第二图像中包含指定风格的“睛”,该第二图像作为目标样本。Continuing to refer to FIG. 3, the first neural network performs feature extraction on the first image to obtain a 512-dimensional feature description vector C indicating “睛”, and the feature description vector C is input into the second neural network, together with the non-standard feature vector S. The style represented by the non-standard feature vector S is, for example, a specified style, and its dimension is, for example, 100. The second neural network fuses the input 512-dimensional feature description vector C and the 100-dimensional non-standard feature vector S into one fusion vector T, and uses the fusion vector T to generate a second image, for example a 64*64 image containing “睛” in the specified style; this second image serves as the target sample.

步骤S201中,第二神经网络将所述特征描述向量C与所述非标准特征向量S进行融合得到融合向量T的实现方式不止一种,比如包括以下三种实现方式:In step S201, the second neural network can fuse the feature description vector C with the non-standard feature vector S into the fusion vector T in more than one way, for example in the following three implementations:
第一种实现方式中,所述第二神经网络包括融合层;所述特征描述向量C与所述非标准特征向量S的维度相同;In a first implementation manner, the second neural network includes a fusion layer; the feature description vector C has the same dimension as the non-standard feature vector S;
所述第二神经网络将所述特征描述向量C与所述非标准特征向量S进行融合得到融合向量T,包括:The second neural network fuses the feature description vector C and the non-standard feature vector S to obtain a fusion vector T, including:
所述第二神经网络利用融合层将所述特征描述向量C与所述非标准特征向量S执行叠加处理得到所述融合向量T。The second neural network uses a fusion layer to perform superposition processing on the feature description vector C and the non-standard feature vector S to obtain the fusion vector T.
该方式中,融合层是用于执行向量叠加处理的计算层,可将特征描述向量C与非标准特征向量S执行叠加处理得到融合向量T。In this method, the fusion layer is a computing layer for performing vector superposition processing, and the fusion vector T can be obtained by performing superposition processing on the feature description vector C and the non-standard feature vector S.
叠加处理的方式可以为加权叠加处理,将特征描述向量C与非标准特征向量S在每一维度上的数值对应地加权求和。比如,C=(a1,a2,a3,……,a512),S=(b1,b2,b3,……,b512),加权叠加处理后的融合向量T=(a1*x1+b1*y1,a2*x2+b2*y2,a3*x3+b3*y3,……,a512*x512+b512*y512),其中,(x1,x2,x3,……,x512)为特征描述向量C在各维度上的数值加权时的权重系数,(y1,y2,y3,……,y512)为非标准特征向量S在各维度上的数值加权时的权重系数。The superposition may be weighted superposition, in which the values of the feature description vector C and the non-standard feature vector S in each dimension are correspondingly weighted and summed. For example, with C=(a1, a2, a3, ..., a512) and S=(b1, b2, b3, ..., b512), the weighted-superposition fusion vector is T=(a1*x1+b1*y1, a2*x2+b2*y2, a3*x3+b3*y3, ..., a512*x512+b512*y512), where (x1, x2, x3, ..., x512) are the weight coefficients applied to the values of the feature description vector C in each dimension, and (y1, y2, y3, ..., y512) are the weight coefficients applied to the values of the non-standard feature vector S in each dimension.
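The weighted superposition above is an element-wise weighted sum of two same-dimensional vectors. A minimal NumPy sketch, with toy values and uniform 0.5 weights as assumptions:

```python
import numpy as np

def weighted_superpose(C, S, x, y):
    """Fusion layer, first implementation: element-wise weighted sum of two
    same-dimensional vectors, T_i = C_i * x_i + S_i * y_i."""
    assert C.shape == S.shape == x.shape == y.shape
    return C * x + S * y

C = np.arange(512, dtype=float)   # a1..a512 (toy values)
S = np.ones(512)                  # b1..b512 (toy values)
x = np.full(512, 0.5)             # weight coefficients for C
y = np.full(512, 0.5)             # weight coefficients for S
T = weighted_superpose(C, S, x, y)
```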
第二种实现方式中,所述第二神经网络包括全连接层和融合层;所述特征描述向量C与所述非标准特征向量S的维度不同;In the second implementation manner, the second neural network includes a fully connected layer and a fusion layer; the dimension of the feature description vector C is different from that of the non-standard feature vector S;
所述第二神经网络将所述特征描述向量C与所述非标准特征向量S进行融合得到融合向量T,包括:The second neural network fuses the feature description vector C and the non-standard feature vector S to obtain a fusion vector T, including:
所述第二神经网络利用全连接层将所述非标准特征向量S映射为维度与所述特征描述向量C的维度相同的参考向量K;The second neural network utilizes a fully connected layer to map the non-standard feature vector S to a reference vector K having the same dimension as the feature description vector C;
所述第二神经网络利用融合层将所述特征描述向量C与参考向量K执行叠加处理得到所述融合向量T。The second neural network uses a fusion layer to perform superposition processing on the feature description vector C and the reference vector K to obtain the fusion vector T.
全连接层是用于执行向量维度映射的计算层。比如,非标准特征向量S的维度为100维,而特征描述向量C的维度为512维,通过全连接层可将非标准特征向量S映射为维度为512维的参考向量K,实现维度的扩展。融合层是用于执行向量叠加处理的计算层,可将特征描述向量C与参考向量K执行叠加处理得到融合向量T,叠加处理的方式与第一种实现方式中类似,在此不再赘述。A fully connected layer is a computing layer that performs vector dimension mapping. For example, if the dimension of the non-standard feature vector S is 100 while the dimension of the feature description vector C is 512, the fully connected layer can map the non-standard feature vector S to a 512-dimensional reference vector K, expanding its dimension. The fusion layer is a computing layer that performs vector superposition: the feature description vector C and the reference vector K are superposed to obtain the fusion vector T. The superposition is similar to that in the first implementation and is not repeated here.
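A fully connected layer is an affine map `K = W @ S + b`. The sketch below shows the second implementation with assumed dimensions (100 → 512) and random stand-in weights; the fixed 0.5/0.5 superposition weights are likewise illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def fully_connected(S, W, b):
    """Fully connected layer: maps the 100-dim style vector S to a reference
    vector K with the same 512-dim as the content vector C."""
    return W @ S + b

S = rng.random(100)                   # 100-dim non-standard feature vector
W = rng.standard_normal((512, 100))   # learned mapping weights (random here)
b = np.zeros(512)
K = fully_connected(S, W, b)

C = rng.random(512)                   # 512-dim feature description vector
T = 0.5 * C + 0.5 * K                 # superposition, as in the first implementation
```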
第三种实现方式中,所述第二神经网络包括融合层;In a third implementation manner, the second neural network includes a fusion layer;
所述第二神经网络将所述特征描述向量C与所述非标准特征向量S进行融合得到融合向量T,包括:The second neural network fuses the feature description vector C and the non-standard feature vector S to obtain a fusion vector T, including:
所述第二神经网络利用融合层将所述特征描述向量C与所述非标准特征向量S进行合并得到所述融合向量T。The second neural network uses a fusion layer to combine the feature description vector C and the non-standard feature vector S to obtain the fusion vector T.
该实现方式尤其适合于特征描述向量C与所述非标准特征向量S的维度不同的情况,当然,在维度相同情况下也是适用的。This implementation is especially suitable for the case where the dimensions of the feature description vector C and the non-standard feature vector S differ; of course, it is also applicable when the dimensions are the same.
该方式中,融合层是用于执行向量合并处理的计算层,可将特征描述向量C与非标准特征向量S执行合并处理得到一个新的融合向量T。In this implementation, the fusion layer is a computing layer that performs vector merging: the feature description vector C and the non-standard feature vector S are merged to obtain a new fusion vector T.
向量的合并是对两个向量在维度上的拼接,合并后的向量维度为两个向量的维度总和。比如,C=(a1,a2,a3,……,a512),S=(b1,b2,b3,……,b100),合并后的T=(a1,a2,a3,……,a512,b1,b2,b3,……,b100)。Merging vectors means concatenating the two vectors along the dimension axis; the dimension of the merged vector is the sum of the dimensions of the two vectors. For example, if C=(a1, a2, a3, ..., a512) and S=(b1, b2, b3, ..., b100), the merged vector is T=(a1, a2, a3, ..., a512, b1, b2, b3, ..., b100).
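The merging described above is a plain concatenation; in NumPy (toy values assumed) it is one call:

```python
import numpy as np

C = np.arange(512, dtype=float)   # 512-dim content vector (toy values)
S = np.arange(100, dtype=float)   # 100-dim style vector (toy values)
T = np.concatenate([C, S])        # merged vector: dimension 512 + 100 = 612
```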
在一个实施例中,所述第二神经网络还包括:用于执行反卷积处理的反卷积层、及用于执行非线性变换的第二非线性变换层;In one embodiment, the second neural network further includes: a deconvolution layer for performing deconvolution processing, and a second nonlinear transformation layer for performing nonlinear transformation;
所述第二神经网络利用所述融合向量T生成第二图像包括:The second neural network using the fusion vector T to generate the second image includes:
所述第二神经网络利用所述反卷积层、第二非线性变换层生成与所述融合向量T对应的第二图像。The second neural network generates a second image corresponding to the fusion vector T by using the deconvolution layer and the second nonlinear transformation layer.
第二神经网络可以包括多层反卷积层,反卷积层执行的是反卷积操作,可以利用融合向量T生成第二图像,并将第二图像输出至第二非线性变换层。第二非线性变换层同样可以增强神经网络的拟合能力,第二非线性变换层输出拟合后的第二图像。当然,第二神经网络的层结构也不限于此,还可以包括其他层比如全连接层等,该全连接层可以实现维度的映射,比如将输入向量的维度映射为更高维度的向量,该全连接层也可以用卷积层来替换。The second neural network may include multiple deconvolution layers; a deconvolution layer performs a deconvolution operation, generating the second image from the fusion vector T and outputting it to the second nonlinear transformation layer. The second nonlinear transformation layer likewise enhances the fitting ability of the neural network and outputs the fitted second image. Of course, the layer structure of the second neural network is not limited to this and may include other layers, such as a fully connected layer that performs dimension mapping, for example mapping the input vector to a higher-dimensional vector; this fully connected layer can also be replaced by a convolutional layer.
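The standard spatial-size formula for a transposed-convolution (deconvolution) layer is `out = (in - 1) * stride - 2 * padding + kernel`. The specific kernel-4/stride-2/padding-1 stack below, growing a 4×4 seed map to the 64×64 second image, is an assumed configuration used only to illustrate how a deconvolution stack can reach the target size:

```python
def deconv_out(size, kernel, stride, padding):
    """Spatial output size of a transposed-convolution (deconvolution) layer:
    out = (in - 1) * stride - 2 * padding + kernel."""
    return (size - 1) * stride - 2 * padding + kernel

# An assumed stack of kernel-4, stride-2, padding-1 deconv layers doubles the
# spatial size each time, growing a small seed map into the 64*64 second image.
size = 4
sizes = [size]
for _ in range(4):
    size = deconv_out(size, kernel=4, stride=2, padding=1)
    sizes.append(size)
```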
在一个实施例中,所述学生网络是在已训练的教师网络监督下训练得到的;In one embodiment, the student network is trained under the supervision of a trained teacher network;
所述第二神经网络中至少一层的网络参数应用了所述教师网络中对应层的网络参数。The network parameters of at least one layer in the second neural network apply the network parameters of the corresponding layer in the teacher network.
通过训练一个教师网络,进而通过教师网络监督第一神经网络与第二神经网络的训练。本实施例的训练方式中,将第一神经网络与第二神经网络的连接结构作为一个学生网络。A teacher network is trained first, and the teacher network then supervises the training of the first neural network and the second neural network. In the training method of this embodiment, the connected structure of the first neural network and the second neural network is treated as one student network.
参看图4,训练学生网络之前,还需先训练一个教师网络,教师网络包括第一神经网络A1和第二神经网络A2。其中,第一神经网络A1与学生网络中的第一神经网络的层结构可以是相同的;第二神经网络A2与学生网络中的第二神经网络的层结构可以是类似的,只是不需要执行向量的融合,因而可以省略融合层。训练分为两步:Referring to FIG. 4, before the student network is trained, a teacher network needs to be trained first; the teacher network includes a first neural network A1 and a second neural network A2. The layer structure of the first neural network A1 may be the same as that of the first neural network in the student network; the layer structure of the second neural network A2 may be similar to that of the second neural network in the student network, except that it does not need to perform vector fusion, so the fusion layer can be omitted. The training is divided into two steps:
首先,将包含指定风格文字(非宋体字)“睛”的样本(可从真实场景中采集)作为教师网络的输入和输出,训练该教师网络,训练完教师网络后可得到第一神经网络A1和第二神经网络A2中各层的网络参数;First, samples containing the specified-style (non-Song-style) character “睛” (which can be collected from real scenes) are used as both the input and the output of the teacher network, and the teacher network is trained; after training, the network parameters of each layer in the first neural network A1 and the second neural network A2 are obtained;
接着,将已训练的教师网络中第二神经网络A2的某一层或几层的网络参数作为学生网络中第二神经网络对应层的网络参数,再将包含宋体字“睛”的样本作为学生网络的输入、将表示指定风格的非标准特征向量S作为学生网络中第二神经网络的输入、及将包含指定风格文字“睛”的样本作为学生网络的输出(第一神经网络所得的特征描述向量会输入至第二神经网络中),训练学生网络,完成对学生网络的训练。Next, the network parameters of one or several layers of the second neural network A2 in the trained teacher network are used as the network parameters of the corresponding layers of the second neural network in the student network. Then a sample containing the Song-style character “睛” is used as the input of the student network, the non-standard feature vector S representing the specified style is used as the input of the second neural network in the student network, and a sample containing the specified-style character “睛” is used as the expected output of the student network (the feature description vector obtained by the first neural network is fed into the second neural network). The student network is then trained, completing its training.
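The parameter transfer in the step above — copying selected teacher-network layer parameters into the corresponding student layers before student training — can be sketched with plain dictionaries. The layer names, shapes, and the choice of which layers to copy are hypothetical; real networks would hold deconvolution weight tensors here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameter dictionaries keyed by layer name; in a real system
# these would hold the weight tensors of teacher network A2 and of the
# student's second neural network.
teacher_A2 = {
    "deconv1": rng.standard_normal((16, 8)),
    "deconv2": rng.standard_normal((8, 4)),
}
student_second = {
    "fusion":  rng.standard_normal((612, 512)),  # fusion layer stays student-only
    "deconv1": np.zeros((16, 8)),
    "deconv2": np.zeros((8, 4)),
}

def transfer(teacher, student, layer_names):
    """Copy the parameters of the selected teacher layers into the
    corresponding student layers before student training starts."""
    for name in layer_names:
        student[name] = teacher[name].copy()

transfer(teacher_A2, student_second, ["deconv1", "deconv2"])
```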
下面再提供一种对第一神经网络和第二神经网络进行训练的方式。Another way of training the first neural network and the second neural network is provided below.
通过训练一个能够区分生成样本和真实样本的分类器,进而通过分类器来监督第一神经网络与第二神经网络的训练。为了简洁描述,该训练方式中,将第一神经网络与第二神经网络的连接结构称为一个神经网络EG。A classifier capable of distinguishing generated samples from real samples is trained, and this classifier then supervises the training of the first neural network and the second neural network. For brevity, in this training method the connected structure of the first neural network and the second neural network is referred to as a neural network EG.
参看图5,真实样本为从真实场景中采集的用于训练神经网络EG的样本,比如是真实场景中采集的包含指定风格文字(非宋体字)的图像,图中示出的为包含指定风格文字“睛”的图像,实际训练过程中可选用较多包含不同指定风格文字的真实样本进行训练,即可得到较佳的网络参数,训练过程分为两步:Referring to FIG. 5, the real samples are samples collected from real scenes for training the neural network EG, for example images collected in real scenes that contain specified-style (non-Song-style) characters; the figure shows an image containing the specified-style character “睛”. In actual training, a larger number of real samples containing different specified-style characters can be used to obtain better network parameters. The training process is divided into two steps:
首先,将包含宋体字“睛”的样本输入到神经网络得到包含指定风格文字“睛”的生成样本,将包含指定风格文字“睛”的真实样本和表示指定风格的非标准特征向量S作为一组输入数据,将神经网络EG的生成样本和表示指定风格的非标准特征向量S作为另一组输入数据,将两组输入数据分别输入至分类器中,训练分类器使其能够区分出生成样本与真实样本,并能够计算出生成样本与真实样本的偏差,完成对分类器的训练;First, a sample containing the Song-style character “睛” is input into the neural network EG to obtain a generated sample containing the specified-style character “睛”. A real sample containing the specified-style character “睛” together with the non-standard feature vector S representing the specified style forms one set of input data, and the generated sample of the neural network EG together with the non-standard feature vector S forms another set. The two sets of input data are input into the classifier separately, and the classifier is trained so that it can distinguish generated samples from real samples and can calculate the deviation between a generated sample and a real sample, completing the training of the classifier;
接着,通过已训练的分类器来监督神经网络EG的网络参数的训练:将包含宋体字“睛”的样本输入到神经网络EG得到包含指定风格文字“睛”的生成样本,将该生成样本和表示指定风格的非标准特征向量S输入到分类器中,分类器计算出该生成样本与相应真实样本之间的偏差后,用损失函数计算偏差对应的损失值,依据损失值调整神经网络EG的网络参数,并返回将包含宋体字“睛”的样本输入到神经网络EG得到包含指定风格文字“睛”的生成样本的步骤继续训练神经网络EG,直至损失函数的损失值降低至合理范围内,完成神经网络EG的训练。Next, the trained classifier supervises the training of the network parameters of the neural network EG: a sample containing the Song-style character “睛” is input into the neural network EG to obtain a generated sample containing the specified-style character “睛”, and the generated sample together with the non-standard feature vector S representing the specified style is input into the classifier. After the classifier calculates the deviation between the generated sample and the corresponding real sample, a loss function computes the loss value corresponding to the deviation, and the network parameters of the neural network EG are adjusted according to the loss value. The process then returns to the step of inputting a sample containing the Song-style character “睛” into the neural network EG, and training continues until the loss value drops into a reasonable range, completing the training of the neural network EG.
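The supervision loop above (generate, score the deviation with the trained classifier, compute a loss, adjust EG, repeat until the loss is in a reasonable range) can be caricatured in a few lines of Python. Everything here — the scalar "sample", the squared-error loss, the learning rate — is a deliberately toy assumption; the patent's EG and classifier are deep networks.

```python
# Toy stand-ins: the "generated sample" is a single scalar controlled by one
# parameter theta, the "real sample" is a fixed scalar, and the classifier is
# reduced to the deviation it computes between generated and real samples.
real_sample = 1.0
theta = -2.0          # stands in for the network parameters of EG
lr = 0.1              # learning rate (illustrative)

def generate(theta):
    """EG forward pass, collapsed to one parameter."""
    return theta

def classifier_deviation(generated, real):
    """The trained classifier's role in step two: score the deviation."""
    return generated - real

losses = []
for _ in range(200):
    generated = generate(theta)
    deviation = classifier_deviation(generated, real_sample)
    loss = deviation ** 2            # loss value computed from the deviation
    losses.append(loss)
    theta -= lr * 2 * deviation      # adjust EG's parameters from the loss
    if loss < 1e-6:                  # "loss drops into a reasonable range"
        break
```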
其中,上述训练过程中,将包含宋体字“睛”的样本输入到神经网络EG得到包含指定风格文字“睛”的生成样本,包括:将包含宋体字“睛”的样本输入到第一神经网络中,由第一神经网络对输入的样本进行特征提取后得到表征宋体字“睛”的内容的特征描述向量C并将C输入到第二神经网络;将表示指定风格的非标准特征向量S也输入到第二神经网络中,由第二神经网络将所述特征描述向量C与所述非标准特征向量S进行融合并利用融合所得向量生成图像,所生成的图像即作为包含指定风格文字“睛”的生成样本。In the above training process, inputting a sample containing the Song-style character “睛” into the neural network EG to obtain a generated sample containing the specified-style character “睛” includes: inputting the sample into the first neural network, which performs feature extraction to obtain a feature description vector C representing the content of the Song-style character “睛” and inputs C into the second neural network; the non-standard feature vector S representing the specified style is also input into the second neural network, which fuses the feature description vector C with the non-standard feature vector S and uses the resulting vector to generate an image; the generated image serves as the generated sample containing the specified-style character “睛”.
上述两种训练方式中,第一神经网络和第二神经网络是一同进行训练的。当然,也可以分开训练第一神经网络与第二神经网络。In the above two training methods, the first neural network and the second neural network are trained together. Of course, the first neural network and the second neural network can also be trained separately.
本发明还提供一种样本生成装置,参看图2,该样本生成装置100包括:The present invention further provides a sample generation apparatus. Referring to FIG. 2, the sample generation apparatus 100 includes:
特征描述向量获取模块101,用于获取指定标准字的特征描述向量C,特征描述向量C用于指示所述指定标准字的内容;a feature description vector acquisition module 101, configured to acquire a feature description vector C of a specified standard word, where the feature description vector C indicates the content of the specified standard word;
目标样本生成模块102,用于利用所述特征描述向量C和指定的非标准特征向量S将指定标准字转换成目标样本,所述目标样本对应的风格与非标准特征向量S表示的风格相同。a target sample generation module 102, configured to convert the specified standard word into a target sample by using the feature description vector C and a specified non-standard feature vector S, where the style corresponding to the target sample is the same as the style represented by the non-standard feature vector S.
在一个实施例中,所述特征描述向量获取模块具体用于:In one embodiment, the feature description vector obtaining module is specifically used for:
将包含有所述指定标准字的第一图像输入至已训练的学生网络中的第一神经网络,以由所述第一神经网络对输入的所述第一图像进行特征提取得到特征描述向量C。The first image containing the specified standard word is input to the first neural network in the trained student network, so that the first neural network performs feature extraction on the inputted first image to obtain a feature description vector C .
在一个实施例中,所述第一神经网络对输入的所述第一图像进行特征提取得到特征描述向量C,包括:In one embodiment, the first neural network performs feature extraction on the inputted first image to obtain a feature description vector C, including:
所述第一神经网络至少通过用于执行卷积处理的卷积层、及用于执行非线性变换处理的第一非线性变换层对所述第一图像进行特征提取得到特征描述向量C。The first neural network performs feature extraction on the first image through at least a convolution layer for performing convolution processing and a first nonlinear transformation layer for performing nonlinear transformation processing to obtain a feature description vector C.
在一个实施例中,所述目标样本生成模块包括:In one embodiment, the target sample generation module includes:
图像生成单元,用于将所述特征描述向量C与所述非标准特征向量S输入至已训练的学生网络中的第二神经网络,以由所述第二神经网络将所述特征描述向量C与所述非标准特征向量S进行融合得到融合向量T,利用所述融合向量T生成第二图像;an image generation unit, configured to input the feature description vector C and the non-standard feature vector S into a second neural network in the trained student network, so that the second neural network fuses the feature description vector C with the non-standard feature vector S to obtain a fusion vector T and generates a second image using the fusion vector T;
目标样本确定单元,用于将所述第二图像确定为所述目标样本。A target sample determination unit, configured to determine the second image as the target sample.
在一个实施例中,所述第二神经网络包括融合层;所述特征描述向量C与所述非标准特征向量S的维度相同;In one embodiment, the second neural network includes a fusion layer; the feature description vector C has the same dimension as the non-standard feature vector S;
所述第二神经网络将所述特征描述向量C与所述非标准特征向量S进行融合得到融合向量T时,具体用于:When the second neural network fuses the feature description vector C and the non-standard feature vector S to obtain a fusion vector T, it is specifically used for:
所述第二神经网络利用融合层将所述特征描述向量C与所述非标准特征向量S执行叠加处理得到所述融合向量T。The second neural network uses a fusion layer to perform superposition processing on the feature description vector C and the non-standard feature vector S to obtain the fusion vector T.
在一个实施例中,所述第二神经网络包括全连接层和融合层;所述特征描述向量C与所述非标准特征向量S的维度不同;In one embodiment, the second neural network includes a fully connected layer and a fusion layer; the dimension of the feature description vector C is different from the dimension of the non-standard feature vector S;
所述第二神经网络将所述特征描述向量C与所述非标准特征向量S进行融合得到融合向量T时,具体用于:When the second neural network fuses the feature description vector C and the non-standard feature vector S to obtain a fusion vector T, it is specifically used for:
所述第二神经网络利用全连接层将所述非标准特征向量S映射为维度与所述特征描述向量C的维度相同的参考向量K;The second neural network utilizes a fully connected layer to map the non-standard feature vector S to a reference vector K having the same dimension as the feature description vector C;
所述第二神经网络利用融合层将所述特征描述向量C与参考向量K执行叠加处理得到所述融合向量T。The second neural network uses a fusion layer to perform superposition processing on the feature description vector C and the reference vector K to obtain the fusion vector T.
在一个实施例中,所述第二神经网络包括融合层;In one embodiment, the second neural network includes a fusion layer;
所述第二神经网络将所述特征描述向量C与所述非标准特征向量S进行融合得到融合向量T时,具体用于:When the second neural network fuses the feature description vector C and the non-standard feature vector S to obtain a fusion vector T, it is specifically used for:
所述第二神经网络利用融合层将所述特征描述向量C与所述非标准特征向量S进行合并得到所述融合向量T。The second neural network uses a fusion layer to combine the feature description vector C and the non-standard feature vector S to obtain the fusion vector T.
在一个实施例中,所述第二神经网络还包括:用于执行反卷积处理的反卷积层、及用于执行非线性变换的第二非线性变换层;In one embodiment, the second neural network further includes: a deconvolution layer for performing deconvolution processing, and a second nonlinear transformation layer for performing nonlinear transformation;
所述第二神经网络利用所述融合向量T生成第二图像时,具体用于:When the second neural network uses the fusion vector T to generate the second image, it is specifically used for:
所述第二神经网络利用所述反卷积层、第二非线性变换层生成与所述融合向量T对应的第二图像。The second neural network generates a second image corresponding to the fusion vector T by using the deconvolution layer and the second nonlinear transformation layer.
在一个实施例中,所述学生网络是在已训练的教师网络监督下训练得到的;In one embodiment, the student network is trained under the supervision of a trained teacher network;
所述第二神经网络中至少一层的网络参数应用了所述教师网络中对应层的网络参数。The network parameters of at least one layer in the second neural network apply the network parameters of the corresponding layer in the teacher network.
上述装置中各个单元的功能和作用的实现过程具体详见上述方法中对应步骤的实现过程,在此不再赘述。For details of the implementation process of the functions and functions of each unit in the above device, please refer to the implementation process of the corresponding steps in the above method, which will not be repeated here.
对于装置实施例而言,由于其基本对应于方法实施例,所以相关之处参见方法实施例的部分说明即可。以上所描述的装置实施例仅仅是示意性的,其中所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元。For the apparatus embodiments, since they basically correspond to the method embodiments, reference may be made to the partial descriptions of the method embodiments for related parts. The device embodiments described above are only illustrative, wherein the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units.
本发明还提供一种电子设备,包括处理器及存储器;所述存储器存储有可被处理器调用的程序;其中,所述处理器执行所述程序时,实现如前述实施例中所述的样本生成方法。The present invention further provides an electronic device, including a processor and a memory; the memory stores a program that can be called by the processor; wherein, when the processor executes the program, the sample generation method described in the foregoing embodiments is implemented.
本发明样本生成装置的实施例可以应用在电子设备上。以软件实现为例,作为一个逻辑意义上的装置,是通过其所在电子设备的处理器将非易失性存储器中对应的计算机程序指令读取到内存中运行形成的。从硬件层面而言,如图6所示,图6是本发明根据一示例性实施例示出的样本生成装置100所在电子设备的一种硬件结构图,除了图6所示的处理器510、内存530、接口520、以及非易失性存储器540之外,实施例中装置100所在的电子设备通常根据该电子设备的实际功能,还可以包括其他硬件,对此不再赘述。Embodiments of the sample generation apparatus of the present invention can be applied to electronic devices. Taking software implementation as an example, the apparatus, in a logical sense, is formed by the processor of the electronic device where it is located reading the corresponding computer program instructions from non-volatile memory into memory and running them. In terms of hardware, as shown in FIG. 6, FIG. 6 is a hardware structure diagram of an electronic device where the sample generation apparatus 100 is located according to an exemplary embodiment. In addition to the processor 510, memory 530, interface 520, and non-volatile memory 540 shown in FIG. 6, the electronic device where the apparatus 100 is located may generally include other hardware according to its actual functions, which will not be described in detail here.
本发明还提供一种机器可读存储介质,其上存储有程序,该程序被处理器执行时,实现如前述实施例中任意一项所述的样本生成方法。The present invention also provides a machine-readable storage medium on which a program is stored, and when the program is executed by a processor, implements the sample generation method described in any one of the foregoing embodiments.
本发明可采用在一个或多个其中包含有程序代码的存储介质(包括但不限于磁盘存储器、CD-ROM、光学存储器等)上实施的计算机程序产品的形式。机器可读存储介质包括永久性和非永久性、可移动和非可移动媒体,可以由任何方法或技术来实现信息存储。信息可以是计算机可读指令、数据结构、程序的模块或其他数据。机器可读存储介质的例子包括但不限于:相变内存(PRAM)、静态随机存取存储器(SRAM)、动态随机存取存储器(DRAM)、其他类型的随机存取存储器(RAM)、只读存储器(ROM)、电可擦除可编程只读存储器(EEPROM)、快闪记忆体或其他内存技术、只读光盘只读存储器(CD-ROM)、数字多功能光盘(DVD)或其他光学存储、磁盒式磁带、磁带磁盘存储或其他磁性存储设备或任何其他非传输介质,可用于存储可以被计算设备访问的信息。The present invention may take the form of a computer program product implemented on one or more storage media (including but not limited to disk storage, CD-ROM, optical storage, and the like) containing program code. Machine-readable storage media include permanent and non-permanent, removable and non-removable media, and information storage can be implemented by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of machine-readable storage media include, but are not limited to: phase-change memory (PRAM), static random-access memory (SRAM), dynamic random-access memory (DRAM), other types of random-access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
以上所述仅为本发明的较佳实施例而已,并不用以限制本发明,凡在本发明的精神和原则之内,所做的任何修改、等同替换、改进等,均应包含在本发明保护的范围之内。The above descriptions are only preferred embodiments of the present invention and are not intended to limit it; any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.
Claims (11)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201910233792.XA CN111753859B (en) | 2019-03-26 | 2019-03-26 | Sample generation method, device and equipment |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201910233792.XA CN111753859B (en) | 2019-03-26 | 2019-03-26 | Sample generation method, device and equipment |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN111753859A true CN111753859A (en) | 2020-10-09 |
| CN111753859B CN111753859B (en) | 2024-03-26 |
Family
ID=72671425
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201910233792.XA Active CN111753859B (en) | 2019-03-26 | 2019-03-26 | Sample generation method, device and equipment |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN111753859B (en) |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20170098153A1 (en) * | 2015-10-02 | 2017-04-06 | Baidu Usa Llc | Intelligent image captioning |
| JP2018132855A (en) * | 2017-02-14 | 2018-08-23 | 国立大学法人電気通信大学 | Image style conversion apparatus, image style conversion method and image style conversion program |
| CN108664996A (en) * | 2018-04-19 | 2018-10-16 | 厦门大学 | A kind of ancient writing recognition methods and system based on deep learning |
| CN109064522A (en) * | 2018-08-03 | 2018-12-21 | 厦门大学 | The Chinese character style generation method of confrontation network is generated based on condition |
| CN109165376A (en) * | 2018-06-28 | 2019-01-08 | 西交利物浦大学 | Style character generating method based on a small amount of sample |
Non-Patent Citations (2)
| Title |
|---|
| ANGELINE AGUINALDO et al.: "Compressing GANs using Knowledge Distillation", arXiv:1902.00159v1 [cs.CV], pages 38 - 39 * |
| XU Yang: "Application of genetic analogy learning based on hidden Markov models in Chinese calligraphy generation", Journal of Wuhan University (Natural Science Edition), no. 01, 29 February 2008 (2008-02-29), pages 90 - 94 * |
Cited By (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112417959A (en) * | 2020-10-19 | 2021-02-26 | 上海臣星软件技术有限公司 | Picture generation method and device, electronic equipment and computer storage medium |
| CN113695058A (en) * | 2021-10-28 | 2021-11-26 | 南通金驰机电有限公司 | Self-protection method of intelligent waste crushing device for heat exchanger production |
| CN113695058B (en) * | 2021-10-28 | 2022-03-15 | 南通金驰机电有限公司 | Self-protection method of intelligent waste crushing device for heat exchanger production |
| CN114818605A (en) * | 2022-04-28 | 2022-07-29 | 杭州网易云音乐科技有限公司 | Font generation and text display method, apparatus, medium and computing device |
Also Published As
| Publication number | Publication date |
|---|---|
| CN111753859B (en) | 2024-03-26 |
Similar Documents
| Publication | Title | |
|---|---|---|
| KR102245220B1 (en) | Apparatus for reconstructing 3d model from 2d images based on deep-learning and method thereof | |
| CN109658455B (en) | Image processing method and processing apparatus | |
| CN110458282A (en) | Multi-angle multi-mode fused image description generation method and system | |
| CN113435585B (en) | Service processing method, device and equipment | |
| CN110599452B (en) | Rust detection system, method, computer device and readable storage medium | |
| CN114332484B (en) | Key point detection method, device, computer equipment and storage medium | |
| Li et al. | FRD-CNN: Object detection based on small-scale convolutional neural networks and feature reuse | |
| CN111753859A (en) | Sample generation method, device and equipment | |
| CN110363830A (en) | Element image generation method, apparatus and system | |
| CN116958324A (en) | Training method, device, equipment and storage medium of image generation model | |
| CN115359219A (en) | Virtual image processing method and device of virtual world | |
| CN111753575 (en) | Text recognition method, device and equipment | |
| CN111325068A (en) | Video description method and device based on convolutional neural network | |
| CN116092154A (en) | A real-time small face detection method based on improved YOLOv5 | |
| KR102649947B1 (en) | System and method of understanding deep context using image and text deep learning | |
| CN111126358A (en) | Face detection method, device, storage medium and device | |
| Daryani et al. | IRL-Net: Inpainted region localization network via spatial attention | |
| KR20240108234A (en) | Electronic device for identifying object from video and operation method thereof | |
| US20250166125A1 (en) | Method of generating image and electronic device for performing the same | |
| Mahajan et al. | Forensic face sketch artist system | |
| CN116630480B (en) | A method, device and electronic device for interactive text-driven image editing | |
| CN115345917B (en) | Multi-stage dense reconstruction method and device with low video memory occupation | |
| CN118552761A (en) | Training method, device, equipment and medium of draft graph model | |
| CN114155417B (en) | Image target identification method and device, electronic equipment and computer storage medium | |
| CN114092733B (en) | Image classification method and device based on single positive image, storage medium and computer |
Legal Events
| Code | Title | | |
|---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||