
CN117036427A - Industrial printed matter image registration method and device based on lightweight network - Google Patents

Industrial printed matter image registration method and device based on lightweight network

Info

Publication number
CN117036427A
CN117036427A
Authority
CN
China
Prior art keywords
feature
channel
network
image
lightweight
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311008463.8A
Other languages
Chinese (zh)
Inventor
Song Qing
Ma Yuandong
Jiang Shunping
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Songze Technology Co ltd
Original Assignee
Jiangsu Songze Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Songze Technology Co ltd
Priority to CN202311008463.8A
Publication of CN117036427A
Legal status: Pending

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/30 Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T7/33 Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/20 Processor architectures; Processor configuration, e.g. pipelining

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an industrial printed matter image registration method based on a lightweight network, comprising the following steps: S1, image acquisition and dataset construction: the front and back sides of printed matter are imaged with a CCD industrial camera, and a training dataset is built in combination with multispectral images; S2, construction of a DHSiam network containing two parallel feature extraction networks; S3, an attention heterogeneity mechanism comprising a position regression unit and a lightweight unit, which enhances data representation and reduces model parameters; S4, construction of a cross-entropy loss function with an additive margin, strengthening the network's metric performance. The invention builds a hardware platform with 8 parallel GPUs, determines image similarity using only AMC-Loss, uses the attention heterogeneity mechanism to extract the salient information of feature maps at different scales while adjusting the model parameter count, strengthens the algorithm's performance on mobile devices, and comprehensively improves the model's ability to register printed matter images under different viewing angles and deformations.

Description

A lightweight-network-based industrial printed matter image registration method and device

Technical Field

The present invention relates to the technical field of target image registration, and in particular to a lightweight-network-based industrial printed matter image registration method and device.

Background

In many computer vision tasks, such as CT image detection, multi-view reconstruction, image retrieval, and image-based localization, finding matching images in large datasets plays a decisive role. Image registration, a popular technique, acquires specified image information in a particular way, judges the similarity of an image pair under different conditions (illumination, shooting angle, shooting position, etc.), and achieves geometric alignment. Current image registration methods can be divided into traditional methods and deep-learning-based methods. Traditional methods comprise rigid and non-rigid registration algorithms. Traditional rigid registration algorithms, such as those based on SIFT or AKAZE features, use an affine transformation to achieve registration from the mapping between features of the source images. Because the features of the source images are relatively stable, this class of algorithms possesses a degree of scale and rotation invariance. However, the limited number of low-level and hand-crafted features makes it difficult to represent the true deformation between images accurately, so these algorithms cannot be used when the source images undergo complex deformation. Traditional non-rigid registration algorithms, such as the optical-flow-based DEMONS algorithm, iteratively optimize the displacement field between the source images until convergence. Such algorithms achieve good registration accuracy, but they converge slowly and registration is time-consuming, which cannot satisfy the real-time requirements of industrial inspection.

With the rapid development of visual imaging technology, data-driven deep learning has entered everyday practice and has been applied successfully in many vision tasks, such as pixel-level high-resolution classification, image segmentation, high-level semantic feature extraction, and change detection. Deep learning is also widely used for feature-region registration: the MatchNet network fuses image-region features through fully connected layers to measure similarity; the DeepCompare method measures the similarity of grayscale patches; and the L2-Net architecture performs registration.

In recent years, deep-learning-based image registration models have grown ever larger and more complex, which severely restricts their industrial application. In the consumer electronics industry, printed matter such as manuals and packaging boxes must undergo strict screening before leaving the factory to ensure a defect-free appearance. The usual workflow for defect detection on paper printed matter compares the image under test with a template image, finds the differences, and then judges whether the product is defective. Differencing is a fast way to extract the differences between two images; the information retained in the difference map consists only of differencing artifacts and defects. Paper, however, is a non-rigid material that deforms easily under external forces; even when fixed with a clamp, it is difficult to obtain exactly the same shape as the template image. Traditional registration methods cannot effectively solve the non-rigid registration of paper printed images and therefore cannot eliminate, at the root, the false alarms produced by defect detection. Moreover, most image registration algorithms learn features with a Euclidean distance loss. Such losses introduce more noise into the output features while the proportion of actual registration information is low, severely restricting registration efficiency, raising the registration cost, and leading to slow model convergence and low accuracy. Existing registration systems mostly decide only whether a given patch pair is registered; their accuracy on larger, higher-resolution image sets is clearly insufficient, their computational burden in practical applications is heavy, and their portability is poor.

Summary of the Invention

The purpose of the present invention is to provide a lightweight-network-based industrial printed matter image registration method and device. The invention builds a hardware platform with 8 parallel GPUs and optimizes cuDNN library routines. Inspired by the success of deep convolutional neural networks in large-scale image classification and feature extraction, it fuses traditional registration methods with current convolutional network architectures to design a GPU-based registration method and, drawing on the Large Margin Cosine Loss (LMCL), proposes an Additive Margin Cross-Entropy Loss (AMC-Loss) that determines the registration attribute using only the modulus of the feature vectors, so as to overcome or alleviate the above-mentioned defects of the prior art.

To solve the above technical problems, the present invention provides the following technical solution: a lightweight-network-based industrial printed matter image registration method, characterized by comprising the following specific steps:

S1: Image acquisition and dataset construction. A CCD industrial camera captures front and back images of printed matter on the production line, and a training dataset is built in combination with multispectral images.

S2: Network construction and optimization. A DHSiam network comprising two parallel feature extraction networks is built on a Pseudo-Siamese architecture.

S3: Attention heterogeneity mechanism, comprising two modules: a position regression unit and a lightweight unit.

S4: AMC-Loss construction. A cross-entropy loss function with an additive margin is constructed, covering the cross-entropy loss design and the optimization of the margin value.

According to the above technical solution, the magnitude of the margin in the loss function is quantitatively adjusted through a parameter value, and a concrete method of quantitatively adjusting the loss via m is further designed.

The attention heterogeneity mechanism is embedded in each residual module of the DHSiam Siamese network and quantitatively adjusts the position information of the feature vectors.

According to the above technical solution, a DHSiam Siamese network model is proposed in step S2.

The DHSiam Siamese network model is based on a parallel, non-weight-sharing two-branch structure; each branch is connected to its own deep neural network: the original-image branch uses a ResNet50 backbone and the to-be-registered branch uses a ResNet101 backbone.

The original image Ix is passed through the ResNet50 branch and the image to be registered Iy through the ResNet101 branch, yielding the corresponding 7×7 feature maps F(Ix) and F(Iy).

Each branch comprises convolutional layers, nonlinear activation units, and normalization layers. During training the image pair Ix and Iy is fed into the two branches, which learn the best feature representation of the input pair and output it in the form of feature vectors.

Global depthwise convolution is used as the metric layer, outputting the corresponding 1-D feature vectors G(Ix) and G(Iy).
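As a hedged illustration (not the patent's exact implementation), the global depthwise convolution metric layer can be sketched in NumPy: each channel of the 7×7 feature map is collapsed by its own 7×7 kernel with no cross-channel mixing, so a C-channel map becomes a C-dimensional feature vector. The shapes and random weights below are illustrative assumptions.

```python
import numpy as np

def gdc_metric_layer(feat, weights):
    """Global depthwise convolution: one full-size kernel per channel,
    no cross-channel mixing, collapsing (h, w, c) -> (c,)."""
    h, w, c = feat.shape
    assert weights.shape == (h, w, c)
    # Per-channel weighted sum over the full spatial extent.
    return np.einsum('hwc,hwc->c', feat, weights)

# Illustrative 7x7x256 feature map F(Ix) from one branch.
rng = np.random.default_rng(0)
f_ix = rng.standard_normal((7, 7, 256))
w = rng.standard_normal((7, 7, 256))
g_ix = gdc_metric_layer(f_ix, w)
print(g_ix.shape)  # (256,)
```

Because the kernel covers the whole 7×7 spatial extent, the output has no spatial dimensions left, which is what makes the result usable directly as a feature vector for the metric.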

According to the above technical solution, in step S2 the convolution kernel of the input Conv1 layer of the traditional residual network is replaced with a series of stacked 3×3 convolutions.
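The parameter saving from this replacement can be checked with a quick count (a back-of-the-envelope sketch; the channel widths are illustrative and biases are ignored): three stacked 3×3 layers cover the same 7×7 receptive field as one 7×7 layer with fewer weights.

```python
def conv_params(k, c_in, c_out):
    """Weight count of a single k x k convolution layer, ignoring bias."""
    return k * k * c_in * c_out

c_in, c_out = 64, 64  # illustrative channel widths
single_7x7 = conv_params(7, c_in, c_out)
stacked_3x3 = 3 * conv_params(3, c_in, c_out)  # three 3x3 layers, same 7x7 receptive field

print(single_7x7, stacked_3x3)  # 200704 110592
```

The stacked form also interleaves extra nonlinearities between the 3×3 layers, which is a common secondary motivation for this substitution.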

According to the above technical solution, in step S4 a cross-entropy loss function with an additive margin is constructed to determine the attribute of an image pair.

The loss function retains the advantage of enlarging inter-class differences while reducing sensitivity to different signals, paying more attention to differences in vector moduli.

The modulus-based loss function is as follows:

During loss computation, only the modulus of each feature vector must be used, so each feature weight is normalized with the L2 norm. l is a binary label indicating whether the input image pair Ix, Iy is a positive sample (l=1) or a negative sample (l=0).
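A minimal sketch of the L2 normalization step described above (illustrative shapes, not the patent's exact code): each weight vector is scaled to unit L2 norm so that only relative moduli enter the loss.

```python
import numpy as np

def l2_normalize(v, eps=1e-12):
    """Scale a vector (or each row of a matrix) to unit L2 norm."""
    norm = np.linalg.norm(v, axis=-1, keepdims=True)
    return v / (norm + eps)

w = np.array([[3.0, 4.0], [0.0, 2.0]])
w_hat = l2_normalize(w)
print(np.linalg.norm(w_hat, axis=-1))  # [1. 1.]
```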

According to the above technical solution, margin information is introduced in step S4 to enhance the distinguishability of the DHSiam network's feature vectors.

An additional margin penalty m is imposed between the output feature vectors Cx and Cy of the original image and the image to be registered, which simultaneously strengthens intra-class compactness and inter-class separation.

The cross-entropy loss function with an additive margin is defined as follows:

where N is the number of training samples, p_j is the modulus of the j-th feature vector output by the Siamese network, and the scale is fixed at s=1; the loss weakens the margin penalty by applying the margin additively.
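The exact AMC-Loss formula is not reproduced in this text, so the following NumPy sketch shows one plausible additive-margin form consistent with the description (binary label l, modulus-based score p, scale s=1, margin m subtracted from positive-pair scores); the sigmoid pairwise formulation and the margin value are assumptions, not the patent's definitive loss.

```python
import numpy as np

def amc_loss(p, label, m=0.35, s=1.0):
    """Additive-margin cross-entropy on feature moduli (sketch).

    p: modulus-based score for each image pair.
    label: 1 for a positive (registered) pair, 0 for a negative pair.
    The margin m is subtracted from positive-pair scores, so positives
    must score at least m higher to reach the same loss.
    """
    p = np.asarray(p, dtype=float)
    label = np.asarray(label, dtype=float)
    logit = s * np.where(label == 1, p - m, p)
    prob = 1.0 / (1.0 + np.exp(-logit))
    eps = 1e-12
    return float(-np.mean(label * np.log(prob + eps)
                          + (1 - label) * np.log(1 - prob + eps)))

loss_with_margin = amc_loss([2.0], [1], m=0.35)
loss_no_margin = amc_loss([2.0], [1], m=0.0)
print(loss_with_margin > loss_no_margin)  # True: margin tightens positives
```

Note that the margin only affects positive pairs here, matching the stated goal of enlarging the gap between positive and negative scores.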

According to the above technical solution, an attention heterogeneity mechanism is added in step S2.

In the attention heterogeneity mechanism, the feature rotation problem is first solved by the position regression unit; on this basis, a heterogeneous attention mechanism is added to model the importance of each channel, reduce the number of model parameters, and output refined feature maps.

According to the above technical solution, in step S3 the position regression unit is first embedded in a ResNet residual block; pooled maps are concatenated along the channel axis to generate an effective feature descriptor in R^(h×w×2), which is followed by cascaded convolutions, and the convolution layers generate the spatial attention map of a single image.

The position regression unit uses two channel-wise pooling operations to aggregate the channel information of a feature map, generating two 2-D maps that represent the channel-wise max-pooled and average-pooled features, respectively; these are then passed through a standard convolution layer to produce the spatial map. The formula is as follows:

where the angle term denotes the angle between the feature of the i-th image in a single batch M and the feature of the (i-1)-th image, i.e., the rotation angle of the feature matrix at position i. The rotation layer outputs the angle-corrected feature information, and padding is added to each rotated region to avoid feature loss.
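A hedged sketch of the channel-pooling step of the position regression unit (CBAM-style; the 1×1 fusion weights below stand in for the convolution layer and are illustrative assumptions): max- and average-pooling along the channel axis produce two h×w maps, stacked into the R^(h×w×2) descriptor and fused into a single spatial map.

```python
import numpy as np

def spatial_descriptor(feat):
    """Stack channel-wise max and mean maps: (h, w, c) -> (h, w, 2)."""
    f_max = feat.max(axis=2)
    f_avg = feat.mean(axis=2)
    return np.stack([f_max, f_avg], axis=2)

def fuse(desc, w=(0.5, 0.5)):
    """Toy 1x1 'convolution' fusing the two pooled maps into one
    spatial map, followed by a sigmoid (stand-in for the conv layer)."""
    m = w[0] * desc[:, :, 0] + w[1] * desc[:, :, 1]
    return 1.0 / (1.0 + np.exp(-m))

rng = np.random.default_rng(1)
feat = rng.standard_normal((7, 7, 64))
desc = spatial_descriptor(feat)
attn = fuse(desc)
print(desc.shape, attn.shape)  # (7, 7, 2) (7, 7)
```

The resulting map can be broadcast back over the channels to reweight spatial positions, which is how such a unit typically refines the feature map.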

According to the above technical solution, in step S3 the lightweight unit consists of a feature splitting module and a feature mapping module. A new hyperparameter K, called the cardinality, is introduced to specify the number of splits of the feature channels (K=32 herein). On this basis, a further hyperparameter r is introduced in the lightweight unit, limiting the number of channel-group splits within the unit, so the total number of groups is c = K·r. The input block is divided along the channel dimension into F_i = {F1, F2, ..., Fc}, where i ∈ {1, 2, ..., c}, followed by a global max-pooling operation that aggregates channel information to generate the feature descriptor U_r(F) ∈ R^(H×W×c/N/r), where r ∈ {1, 2, ..., N}, N is the number of feature-vector channels per group after grouping, and H and W are the height and width of the channel map (after global pooling, H = W). Through global average pooling across the channel dimension, global context information s_r ∈ R^(c/K/r) with embedded channel statistics is collected; the formula is as follows:

Each feature-map channel is produced as a weighted combination of the splits; the c-th channel is computed as:

where a_j(c) denotes the soft channel assignment weight:

in which the mapping is the weight ratio of the s_r channel position represented by the context information.

Dynamic convolution kernels of different sizes are then attached. Randomly shrinking kernel sizes would leave some important features under-extracted or lost; instead, the soft channel assignment weight a is introduced through r-Softmax, and the kernel size of each channel is adjusted according to its weight factor. The specific formula is as follows:

where s is a threshold parameter given by the actual inspection requirements, i.e., s-1 1×1 convolution kernels are inserted between every pair of K×K kernels. According to the channel weights obtained from the formula, the kernels on channels with small weights are reduced from 3×3 to 1×1, while 3×3 kernels are kept on important channels.
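A hedged NumPy sketch of the lightweight unit's channel grouping and kernel selection (group counts, the pooling choice, and the threshold are illustrative assumptions, not the patent's exact design): channels are split into K·r groups, global average pooling yields one context statistic per group, r-Softmax turns the statistics into soft assignment weights, and low-weight groups have their kernels shrunk from 3×3 to 1×1.

```python
import numpy as np

def group_context(feat, groups):
    """Split (h, w, c) channels into `groups` contiguous groups and
    global-average-pool each group into one context statistic."""
    h, w, c = feat.shape
    assert c % groups == 0
    split = feat.reshape(h, w, groups, c // groups)
    return split.mean(axis=(0, 1, 3))  # one scalar per group

def r_softmax(s):
    """Softmax over groups -> soft channel-assignment weights a."""
    e = np.exp(s - s.max())
    return e / e.sum()

def select_kernels(a, threshold):
    """Keep 3x3 kernels on important groups, shrink the rest to 1x1."""
    return np.where(a >= threshold, 3, 1)

rng = np.random.default_rng(2)
K, r = 4, 2                      # toy cardinality and split count
feat = rng.standard_normal((7, 7, K * r * 8))
s = group_context(feat, groups=K * r)
a = r_softmax(s)
kernels = select_kernels(a, threshold=1.0 / (K * r))
print(a.shape, kernels)  # 8 weights, kernel size 3 or 1 per group
```

Using the uniform weight 1/(K·r) as the threshold means groups carrying more than their equal share of context keep the larger kernel; any other application-specific threshold could be substituted.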

According to the above technical solution, in step S3 the position regression unit and the lightweight unit are arranged in series, and the positions and numbers of the two mechanisms in the whole model are adjusted according to the sample set.

Compared with the prior art, the beneficial effects achieved by the present invention are as follows: the invention uses a cross-entropy loss to convert the Euclidean space into a modulus space, improving registration accuracy; the matching cost is computed directly from the moduli of the two vectors in the embedding space. The magnitude of the margin can be quantitatively adjusted through the parameter m, and the invention further derives a concrete m for quantitatively adjusting the loss. In addition, an attention heterogeneity mechanism is proposed within the DHSiam architecture to adjust the position information of the feature vectors, reduce mismatches caused by image rotation, and reduce the number of model parameters. Both the cross-entropy loss and the attention heterogeneity mechanism show clear advantages in extensive image matching analyses, and the experimental results verify the portability, robustness, and effectiveness of the method.

Brief Description of the Drawings

The drawings are provided for a further understanding of the present invention and constitute a part of the specification; together with the embodiments they serve to explain the invention and do not limit it. In the drawings:

Figure 1 is a flow chart of the lightweight-network-based industrial printed matter image registration method of the present invention;

Figure 2 is a schematic structural diagram of the image acquisition component in an embodiment of the present invention;

Figure 3 is a schematic structural diagram of the stand-alone industrial-camera acquisition device in an embodiment of the present invention;

Figure 4 is a schematic diagram of the deep learning sub-network of the present invention;

Figure 5 is a schematic diagram of the position-regression-unit network of the attention heterogeneity mechanism of the present invention;

Figure 6 is a schematic diagram of the lightweight-unit network of the attention heterogeneity mechanism of the present invention;

Figure 7 is a schematic diagram of a specific form of the lightweight unit of the present invention;

Figure 8 is a schematic diagram of an example printed matter image sample of the present invention;

In the figures: 1, industrial camera; 2, fill light; 3, position sensor; 4, grooved conveyor belt; 5, electrical connection port; 6, vertical slide rail; 7, horizontal slide rail; 8, printed matter carrier board.

Detailed Description of the Embodiments

The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the invention. Based on these embodiments, all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the scope of protection of the present invention.

Referring to Figures 1-8, the present invention provides a technical solution: a deep-learning-based industrial printed matter image registration method, comprising:

S1: Image acquisition and dataset construction. As shown in Figure 2, a CCD industrial camera captures front and back images of printed matter on the production line, and a training dataset is built in combination with multispectral images.

S2: Network construction and optimization. As shown in Figure 4, a DHSiam (Dynamic and Heterogeneous Siamese) network is built on a Pseudo-Siamese (non-weight-sharing Siamese) architecture, comprising two parallel feature extraction networks (ResNet residual networks);

S3: Attention heterogeneity mechanism. As shown in Figures 5 and 6, it comprises a position regression unit and a lightweight unit, enhancing data representation and reducing model parameters;

S4: AMC-Loss construction. A cross-entropy loss function with an additive margin (AMC-Loss) is constructed, covering the cross-entropy loss design and the optimization of the margin value, strengthening the network's metric performance.

The embodiments of the present invention can effectively improve the robustness and accuracy of the system during registration and adapt well to mobile devices in industrial settings. The invention uses a cross-entropy loss to convert the Euclidean space into a modulus space, improving registration accuracy; the matching cost is computed directly from the moduli of the two vectors in the embedding space.

In the embodiments of the present invention, the magnitude of the margin in the loss function can be quantitatively adjusted through the parameter m, and a concrete m is further designed to quantitatively adjust the loss.

The attention heterogeneity mechanism of the embodiments is embedded in each residual module of the DHSiam network; it quantitatively adjusts the position information of the feature vectors, reduces mismatches caused by image rotation, and solves the portability problem caused by over-large trained models.

In the embodiments of the present invention, a large number of experiments on the registration system have verified the portability, robustness, and effectiveness of the method.

In the embodiments of the present invention, the application of the registration system is not limited to industrial printed images; it is equally applicable to medical CT images, remote-sensing satellite images, three-dimensional modeling, industrial defect detection, and other fields.

In an embodiment of the present invention, the printed matter image acquisition device is shown in Figure 2. The device comprises a grooved conveyor component and image acquisition components on both sides of the groove. Each image acquisition component includes an industrial camera 1, a fill light 2, a position sensor 3, a vertical slide rail 6, and a horizontal slide rail 7; the position of the industrial camera 1 is adjustable. The grooved conveyor component includes the grooved conveyor belt 4 and the electrical connection port 5. The distance between the image acquisition device and the grooved conveyor belt 4 is adjustable, the fill light 2 is mounted outside the lens of the industrial camera 1, and the industrial camera 1 is connected to a mobile computing platform.

Specifically, the fill light 2 is mounted outside the lens of the industrial camera 1 to ensure that the captured image under test is clear and unaffected by external light.

Specifically, the image acquisition component includes a vertical slide rail 6 and a horizontal slide rail 7; the industrial camera 1 is mounted on the slider of the vertical slide rail 6, the vertical slide rail 6 is connected to the horizontal slide rail 7, and the horizontal slide rail 7 is connected to the industrial control electromechanical device.

In an embodiment of the present invention, a position sensor 3 is mounted on the chute of the vertical slide rail 6 and connected to the industrial control electromechanical device; the distance between the industrial camera 1 and the grooved conveyor belt 4 and the height of the industrial camera 1 are adjusted according to the information detected by the position sensor 3.

In an embodiment of the present invention, the stand-alone industrial-camera acquisition device is shown in Figure 3 and comprises an industrial camera 1, a fill light 2, and a printed matter carrier board 8. Specifically, a simulated background can be added to the carrier board 8, and industrial printed matter pasted on it can be captured from the front and the back.

In an embodiment of the present invention, the carrier board 8 of Figure 3 can be placed vertically in the grooved conveyor belt 4 of Figure 2 so that front and back views are captured simultaneously. The carrier board 8 can simulate various acquisition environments and add acquisition backgrounds.

DHSiam is based on a paired, non-weight-sharing network structure consisting of two different branches, each connected to a corresponding deep neural network: the original-image branch uses a ResNet50 structure and the to-be-registered branch uses a ResNet101 structure, ensuring that image information is fully extracted;

The network passes the original image Ix through the ResNet50 branch and the image to be registered Iy through the ResNet101 branch, outputting the corresponding feature maps F(Ix) and F(Iy);

In an embodiment of the present invention, each branch of the network contains convolutional layers, nonlinear activation units, and normalization layers. During training, the image pair Ix and Iy is fed into the two branches, and the optimal feature representation of the input pair is output in the form of feature vectors.
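The dual-branch forward pass described above can be sketched as follows. This is a schematic illustration only: tiny random projections stand in for the ResNet50 and ResNet101 backbones, and the key point shown is that the two branches do not share weights and have different depths.

```python
import numpy as np

# Schematic sketch of the DHSiam pseudo-Siamese forward pass. The stand-in
# "backbones" below are assumptions for illustration: stacks of random
# linear layers with ReLU, NOT real ResNets, and NOT weight-shared.
rng = np.random.default_rng(0)

def make_branch(in_dim, out_dim, depth):
    """Return a stand-in feature extractor: `depth` linear layers + ReLU."""
    weights = [rng.standard_normal((in_dim if i == 0 else out_dim, out_dim)) * 0.1
               for i in range(depth)]
    def branch(x):
        for w in weights:
            x = np.maximum(x @ w, 0.0)  # linear layer followed by ReLU
        return x
    return branch

branch_x = make_branch(in_dim=64, out_dim=32, depth=4)  # shallower "ResNet50" side
branch_y = make_branch(in_dim=64, out_dim=32, depth=8)  # deeper "ResNet101" side

Ix = rng.standard_normal(64)  # flattened stand-in for the original image
Iy = rng.standard_normal(64)  # flattened stand-in for the image to be registered

F_Ix, F_Iy = branch_x(Ix), branch_y(Iy)  # per-branch features F(Ix), F(Iy)
print(F_Ix.shape, F_Iy.shape)            # both branches emit same-sized features
```

Although the branches differ in depth, they emit features of the same dimensionality, which is what allows the later metric layer to compare them.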

Further, the registration network contains a metric layer for feature dimensionality reduction. The network adopts global depthwise convolution (GDC) as the metric layer, outputting the corresponding 1×1 feature vectors G(Ix) and G(Iy);
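A global depthwise convolution of this kind can be sketched in a few lines: one kernel per channel collapses that channel's spatial map to a single value, with no cross-channel mixing. The 7×7×16 shapes below are illustrative assumptions, not the patent's exact configuration.

```python
import numpy as np

# Minimal sketch of a global depthwise convolution (GDC) metric layer:
# each channel has its own full-size kernel, so a (H, W, C) feature map
# collapses to a C-dimensional (i.e. 1x1xC) descriptor.
rng = np.random.default_rng(1)

def gdc(feature_map, kernels):
    """feature_map: (H, W, C); kernels: (H, W, C) depthwise weights -> (C,)."""
    # Each output channel is the weighted sum over its own spatial map only.
    return np.einsum("hwc,hwc->c", feature_map, kernels)

C = 16
F_Ix = rng.standard_normal((7, 7, C))     # backbone output F(Ix), assumed 7x7xC
W = rng.standard_normal((7, 7, C)) * 0.1  # one kernel per channel

G_Ix = gdc(F_Ix, W)                       # 1x1 feature vector G(Ix)
print(G_Ix.shape)
```

Because there is no cross-channel mixing, the layer costs only H·W·C weights, far fewer than a fully connected reduction.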

In an embodiment of the present invention, the convolution kernel of the input Conv1 layer of the traditional residual network is replaced with a combination of several 3×3 convolutions in series, avoiding the extra parameters brought by a larger kernel and reducing the computational load of the cuDNN library functions.
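A back-of-the-envelope count shows why the serial 3×3 replacement saves parameters: three stacked 3×3 layers cover the same 7×7 receptive field (receptive field of n stacked 3×3 convolutions is 1 + 2n) with roughly half the weights. The channel count below is an illustrative assumption, not the patent's configuration.

```python
# Parameter count of one 7x7 convolution versus three serial 3x3 convolutions
# at equal channel width (channel count is an assumption for illustration).
def conv_params(k, c_in, c_out):
    return k * k * c_in * c_out        # weight count of a k x k convolution, no bias

c = 64
big = conv_params(7, c, c)             # one 7x7 kernel: 49 * c * c weights
small = 3 * conv_params(3, c, c)       # three serial 3x3 kernels: 27 * c * c weights
print(big, small)                      # 200704 110592
```

The stacked form also inserts extra nonlinearities between the 3×3 layers, which is a side benefit the single large kernel does not have.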

In step S4, to ensure the robustness and stability of the system, an Additive Margin Cross Entropy Loss (AMC-Loss) function is constructed to determine the attributes of an image pair;

The loss function retains the advantage of enlarging inter-class differences while paying more attention to differences in vector modulus.

The present invention designs the modulus-based loss function as follows:

During loss computation, only the modulus of the feature vector should be taken, so the present invention uses the L2 algorithm to normalize each feature weight. l is a binary label used to decide whether the input image pair Ix, Iy is a positive or a negative sample;

The present invention designs the cosine-similarity-based loss function as follows:

The present invention adds an additional margin penalty m between the output feature vectors Cx and Cy of the original image and the image to be registered, simultaneously enhancing intra-class compactness and inter-class differences;

The Additive Margin Cross Entropy Loss (AMC-Loss) is defined as follows:

where N is the number of training samples and pj is the modulus of the j-th feature vector output by the Siamese network, with scale calibration s = 1. The loss adds the margin additively to soften the margin penalty, avoiding the increased training difficulty caused by multiplicative margins and the double-angle formula.
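The patent's loss formulas appear only as figures in this text, so the sketch below is a hedged reconstruction following the standard additive-margin recipe (as in AM-Softmax and the cited "Additive margin cosine loss for image registration"): L2-normalize both feature vectors, subtract a margin m from the cosine similarity of positive pairs only, scale by s, and apply a binary log loss. The margin value and the sigmoid-style squashing are assumptions.

```python
import numpy as np

# Hedged sketch of an additive-margin pair loss in the spirit of AMC-Loss.
# m and the binary-cross-entropy form are illustrative assumptions; the
# patent's exact expression is not reproduced in the source text.
def l2_normalize(v, eps=1e-12):
    return v / (np.linalg.norm(v) + eps)

def amc_loss(cx, cy, label, m=0.35, s=1.0):
    """label: 1 for a matching (positive) pair, 0 for a negative pair."""
    cos = float(l2_normalize(cx) @ l2_normalize(cy))  # cosine similarity in [-1, 1]
    logit = s * ((cos - m) if label == 1 else cos)    # additive margin on positives only
    p = 1.0 / (1.0 + np.exp(-logit))                  # squash to a match probability
    return -np.log(p) if label == 1 else -np.log(1.0 - p)

cx = np.array([1.0, 0.2, 0.0])
pos = amc_loss(cx, cx, label=1)    # identical pair, penalised only through margin m
neg = amc_loss(cx, -cx, label=0)   # opposite pair with a negative label: low loss
print(round(pos, 4), round(neg, 4))
```

The margin forces positive pairs to exceed cosine similarity m before the loss becomes small, which is what tightens the intra-class clusters while pushing classes apart.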

During the design of the registration system, rotation of the image to be registered introduces considerable uncertainty into registration; how effectively the impact of image rotation on feature extraction is avoided determines the performance of the algorithm.

An embodiment of the present invention adds an attention heterogeneity mechanism on top of the DHSiam network. The mechanism contains two sub-units: a position regression unit and a lightweight unit. The position regression unit first solves the feature rotation problem; on this basis, heterogeneous lightweight units are added to reduce the number of model parameters and facilitate deployment on mobile devices.

In an embodiment of the present invention, given a set of images, the reference image and the image to be registered differ in scale, angle, and field of view. Because convolution uses local receptive fields, the corresponding features may differ by some angle; these differences introduce intra-class inconsistency and degrade registration accuracy. To solve this problem, the present invention proposes a position regression unit that explores and learns angle information between images, thereby avoiding misregistration caused by image rotation. The method adaptively aggregates feature context information and improves the feature representation capability for image registration.

To avoid registration differences caused by image rotation, obtaining the angular information of features is key. As shown in Figure 5, the position regression unit is designed to map global context information onto the features generated by the extended residual network and rotate them by the corresponding angle to obtain a better feature representation. To obtain the angle information, the present invention first adds pooling operations along the channel axis on top of the ResNet residual block and merges them into an effective feature descriptor of size R^(h×w×2); pooling along the channel axis effectively highlights positional information. Cascaded convolutions follow, using convolutional layers to generate the spatial mapping region of a single image; finally, the rotation angle is obtained through normalization and inner-product operations.

An embodiment of the present invention uses two channel pooling operations to aggregate the channel information of a feature map, generating two 2-D maps that represent the channel-level maximum-fused feature and average-fused feature, respectively. They are then passed through a standard convolutional layer to generate the spatial map; this layer uses a 5×5 convolution kernel, whose larger receptive field captures more spatial information. Normalization and inner-product operations follow, with the formula:

where the resulting quantity denotes the angle between the feature values of the i-th input image and the (i-1)-th image in a single batch M, i.e., the rotation angle of the feature matrix at position i. The rotation layer then outputs the angle-corrected feature information, and padding is added to each rotated region to avoid feature loss caused by rotation.
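The pooling-then-angle idea can be sketched as follows. For brevity, the 5×5 convolution between the descriptor and the angle step is omitted, and the flattened descriptor itself is treated as the spatial map; that simplification, and all shapes, are assumptions for illustration.

```python
import numpy as np

# Sketch of the position regression idea: pool along the channel axis into a
# 2-channel descriptor (max + average), then estimate the angle between
# consecutive images in a batch via a normalised inner product.
rng = np.random.default_rng(2)

def channel_pool(F):
    """F: (h, w, c) feature map -> (h, w, 2) descriptor [max ; mean] over channels."""
    return np.stack([F.max(axis=-1), F.mean(axis=-1)], axis=-1)

def rotation_angle(d_prev, d_curr, eps=1e-12):
    """Angle (radians) between two flattened descriptors via normalised inner product."""
    a, b = d_prev.ravel(), d_curr.ravel()
    cos = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps)
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

F_prev = rng.standard_normal((8, 8, 16))
F_curr = np.rot90(F_prev)                 # simulate a rotated view of the same features
theta = rotation_angle(channel_pool(F_prev), channel_pool(F_curr))
print(0.0 <= theta <= np.pi)              # a valid angle always lies in [0, pi]
```

Identical descriptors give an angle of zero, so already-aligned images pass through the rotation layer unchanged.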

The lightweight unit in the embodiment of the present invention is a feature computation unit composed of a feature split module and a feature mapping module. Figures 6 and 7 depict its specific form.

In Figure 6, the lightweight unit of step S3 consists of a feature split module and a feature mapping module. The present invention introduces a new hyperparameter K that specifies the number of splits of the feature channels; the obtained hyperparameter is referred to as the cardinality group (K = 32 in the present invention).

For further lightweighting, the present invention introduces a new hyperparameter r that limits the number of splits of the channel groups within the unit, so the total number of channels is c = Kr. The input block c is divided along the channel dimension into Fi = {F1, F2, ..., Fc}, where i ∈ {1, 2, ..., c}, followed by a global max pooling operation that aggregates the feature channel information and generates the feature descriptor Ur(F) ∈ R^(H×W×c/N/r), where r ∈ {1, 2, ..., N}, N is the number of feature-vector channels per group after grouping, and H and W are the height and width of the channel vector (after global pooling, H = W). Through global average pooling across the channel dimension, global context information with embedded channel statistics sr ∈ R^(c/K/r) can be collected, with the formula:
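The split-and-pool step can be sketched directly. Group sizes and the max-then-average ordering below follow one reading of the text and are assumptions for illustration.

```python
import numpy as np

# Sketch of the lightweight unit's split step: c = K*r channels are divided
# into K groups along the channel axis; global max pooling summarises each
# group, and global average pooling yields per-channel context statistics.
rng = np.random.default_rng(3)

K, r = 32, 2
c = K * r                               # total channel count c = K*r
F = rng.standard_normal((7, 7, c))      # input feature block

groups = np.split(F, K, axis=-1)        # K groups of r channels each
# global max pooling inside each group aggregates feature channel information
U = [g.max(axis=(0, 1)) for g in groups]  # each entry: (r,) descriptor
# global average pooling across the spatial dims yields channel statistics
s = F.mean(axis=(0, 1))                   # context vector, one value per channel

print(len(groups), U[0].shape, s.shape)
```

The context vector s is what the later softmax step turns into per-channel assignment weights.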

Each feature-map channel is produced by a weighted combination of the splits. The c-th channel is computed as:

where aj(c) denotes the soft channel assignment weight:

where the mapping is the weight ratio of the sr channel positions represented by the context information.
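Since the exact r-Softmax expression appears only as a figure, the sketch below uses an ordinary per-group softmax as a stand-in: for each output channel, a softmax over its r split candidates turns context statistics into mixing weights that sum to one.

```python
import numpy as np

# Stand-in sketch of the soft channel-assignment weights a_j(c): a softmax
# over the r split candidates of each group. The true r-Softmax form in the
# patent may differ; this is an assumption for illustration.
def soft_assignment(scores):
    """scores: (groups, r) context logits -> (groups, r) weights, rows sum to 1."""
    z = scores - scores.max(axis=1, keepdims=True)  # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

scores = np.array([[2.0, 0.5, 0.1],
                   [0.0, 0.0, 0.0]])                # uniform context -> uniform weights
a = soft_assignment(scores)
print(np.round(a.sum(axis=1), 6))                   # each row of weights sums to 1
```

Rows summing to one is what makes the weighted combination of splits a convex mixture rather than an unbounded sum.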

In Figure 7, a dynamic convolution kernel follows, containing different kernel sizes. However, randomly shrinking kernel sizes would leave some important features insufficiently extracted or lost. To solve this, the present invention introduces the soft channel assignment weight a through r-Softmax and adjusts the kernel size of each channel according to its weight factor. The specific formula is as follows:

where s is a threshold parameter given by the actual detection requirements (i.e., s-1 1×1 convolution kernels are inserted between every pair of K×K kernels). Based on the obtained per-channel weights, the convolution kernels on low-weight channels are reduced from 3×3 to 1×1; by keeping 3×3 kernels on the important channels, the present invention ensures that the kernels extract sufficient feature information.
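The weight-driven kernel schedule reduces to a simple rule: channels whose soft weight falls below a threshold keep only a cheap 1×1 kernel, while important channels keep the full 3×3 kernel. The threshold value below is an illustrative assumption, not the patent's tuned parameter s.

```python
# Sketch of the weight-driven kernel-size selection (threshold is assumed).
def choose_kernel_sizes(channel_weights, threshold=0.1):
    """Return 3 for important channels, 1 for low-weight channels."""
    return [3 if w >= threshold else 1 for w in channel_weights]

weights = [0.40, 0.05, 0.30, 0.02, 0.23]   # per-channel soft assignment weights
sizes = choose_kernel_sizes(weights)
print(sizes)                                # [3, 1, 3, 1, 3]

# parameter saving versus keeping 3x3 everywhere (single in/out channel):
full = 9 * len(weights)
mixed = sum(k * k for k in sizes)
print(full, mixed)                          # 45 29
```

The saving grows with the fraction of low-weight channels, which is why the accuracy/size trade-off depends on how the soft weights distribute.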

Compared with traditional homogeneous convolution, the lightweight module of the present invention replaces some of the larger convolution kernels, reducing the number of model parameters. Moreover, the model adjusts dynamically via the attention mechanism, using larger kernels to extract important information and smaller kernels to extract unimportant information, so the whole model maintains its extraction accuracy while reducing the parameter count.

In an embodiment of the present invention, the position regression unit and the lightweight unit are aggregated to make full use of context information. Specifically, the present invention transforms the feature information through pooling layers, accumulates different weights for pixels or channels through different forms of pooling to model feature importance, and finally outputs the prediction map through different residual blocks. Experiments show that arranging the two mechanisms in series works better than in parallel; in a concrete model, the positions and numbers of the two mechanisms are adjusted to improve the computational efficiency of the GPU acceleration library cuDNN.

To ensure that library-function outputs correspond to inputs, the data structures and parameter counts of adjacent library functions must match exactly. This is similar to the transfer between successive layers of a convolutional neural network; an embodiment of the present invention writes the library functions as abstract classes in object-oriented C++ for convenient invocation by developers.

In an embodiment of the present invention, vision applications using deep learning are image-set-driven: a complete, efficient industrial printed-matter dataset is pivotal to a deep learning task, while existing public datasets have obvious shortcomings in scale, diversity, and applicability. The training procedure of the present invention requires a network supervised by image pairs and their geometric relations, which usually demands a large, richly representative dataset. Compiling a strong dataset requires a variety of imaging devices. Therefore, a two-part dataset is designed in the experiments of the present invention:

The first part uses 3,000 pairs of printed images of different landmarks (including the Great Wall, the Forbidden City, the Summer Palace, the Potala Palace, Buckingham Palace, the Taj Mahal, etc.), with the printed matter mounted in the acquisition device shown in Figure 2;

To ensure sample richness, 100,000 pairs of printed-matter images were assembled from printed-matter collections gathered online together with public printed-matter collections.

In an embodiment of the present invention, image preprocessing is a necessary stage; data augmentation is used to expand the training images. An overview of the method is shown in Figure 8, where the left side is the original image and the right side the image after Unity pose transformation. The aligned images are cropped to remove useless regions, and the image data are standardized using histogram equalization, Z-score normalization, and downsampling. During training, random rotation and horizontal flipping are applied to enlarge the dataset. The CNN is trained on the expanded training data to guarantee the best performance of the model.
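The preprocessing and augmentation steps named above can be sketched as follows. Crop margins, the downsample factor, and the use of 90-degree rotations are illustrative assumptions; histogram equalization is omitted for brevity.

```python
import numpy as np

# Sketch of the preprocessing/augmentation pipeline: crop, Z-score
# normalisation, naive downsampling, then random rotation / horizontal flip.
rng = np.random.default_rng(4)

def preprocess(img, crop=4, factor=2):
    img = img[crop:-crop, crop:-crop]               # crop away useless border
    img = (img - img.mean()) / (img.std() + 1e-12)  # Z-score normalisation
    return img[::factor, ::factor]                  # naive downsampling

def augment(img):
    img = np.rot90(img, k=rng.integers(0, 4))       # random 90-degree rotation
    if rng.random() < 0.5:
        img = np.fliplr(img)                        # random horizontal flip
    return img

raw = rng.standard_normal((72, 72))
out = augment(preprocess(raw))
print(out.shape)                                    # (32, 32)
```

In practice the same transform parameters would be applied to both images of a pair so that the supervision (the geometric relation) stays consistent.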

In the embodiment of the present invention, roughly three to four times as many negative pairs as positive pairs are selected. This sampling strategy guarantees rich, representative negatives that cover, as far as possible, every situation that arises in the system's actual registration process. Different degrees of data augmentation are then applied to the dataset; rich training data makes training more efficient and the resulting model more robust.
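The pair-sampling ratio can be sketched as a simple generator: one positive pair per item plus a fixed number of randomly drawn negatives. The item identifiers are made up for illustration.

```python
import random

# Sketch of building training pairs with ~3 negatives per positive.
def build_pairs(items, neg_per_pos=3, seed=0):
    rng = random.Random(seed)
    pairs = []
    for item in items:
        pairs.append((item, item, 1))               # positive: same print
        others = [o for o in items if o != item]
        for o in rng.sample(others, min(neg_per_pos, len(others))):
            pairs.append((item, o, 0))              # negative: different prints
    return pairs

items = [f"print_{i}" for i in range(10)]
pairs = build_pairs(items)
n_pos = sum(1 for _, _, y in pairs if y == 1)
n_neg = sum(1 for _, _, y in pairs if y == 0)
print(n_pos, n_neg)                                 # 10 30
```

Raising `neg_per_pos` to 4 gives the upper end of the stated three-to-four-times ratio.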

It should be understood that the parts not elaborated in this specification belong to the prior art.

It should be understood that the scope of application of the present invention includes, but is not limited to, the field of industrial printed-matter image processing. The above description of preferred embodiments is relatively detailed and should not therefore be regarded as limiting the scope of patent protection of the present invention.

It should be noted that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another and do not necessarily require or imply any such actual relationship or order between them. Moreover, the terms "comprise", "include", or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device.

Finally, it should be noted that the above are only preferred embodiments of the present invention and are not intended to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions recorded in the foregoing embodiments or substitute equivalents for some of the technical features. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (10)

1. An industrial printed-matter image registration method based on a lightweight network, characterized by comprising the following specific steps:
S1: image acquisition and dataset construction: a CCD industrial camera captures front- and back-side images of printed matter on the production line, and a training dataset is constructed in combination with multispectral images;
S2: network construction and optimization: a DHSiam network covering two parallel feature extraction networks is built on a Pseudo-Siamese basis;
S3: attention heterogeneity mechanism: covering two modules, namely a position regression unit and a lightweight unit;
S4: AMC-Loss construction: an Additive Margin Cross Entropy Loss function is constructed, covering the cross entropy loss design and the optimization of the margin value.

2. The method according to claim 1, characterized in that the size of the margin in the loss function is quantitatively adjusted through a parameter value, and a specific method of quantitatively adjusting the loss with m is further designed; the attention heterogeneity mechanism is embedded in each residual-module architecture of the DHSiam Siamese network and quantitatively adjusts the positional information of the feature vectors.

3. The method according to claim 1, characterized in that a DHSiam Siamese network model is proposed in step S2; the DHSiam Siamese network model is based on a parallel weight-sharing network structure and consists of two different branches, each connected to a corresponding deep neural network, the original-image branch using a ResNet50 structure and the to-be-registered branch using a ResNet101 structure; the original image Ix is passed through the ResNet50 branch and the image to be registered Iy through the ResNet101 branch, outputting the corresponding 7×7 feature maps F(Ix) and F(Iy); each branch contains convolutional layers, nonlinear activation units, and normalization layers; during training the image pair Ix and Iy is fed into the two branches, and the optimal feature representation of the input pair is output in the form of feature vectors; global depthwise convolution is used as the metric layer, outputting the corresponding 1-dimensional feature vectors G(Ix) and G(Iy).

4. The method according to claim 3, characterized in that in step S2 the convolution kernel of the input Conv1 layer of the traditional residual network is replaced with a combination of several 3×3 convolutions in series.

5. The method according to claim 1, characterized in that in step S4 an additive-margin cross entropy loss function is constructed to determine the attributes of an image pair; the loss function retains the advantage of enlarging inter-class differences, reduces sensitivity to different signals, and pays more attention to differences in vector modulus; the modulus-based loss function is designed as follows:
during loss computation only the modulus of the feature vector should be taken, so the L2 algorithm is used to normalize each feature weight, and l is a binary label used to decide whether the input image pair Ix, Iy is a positive sample (l=1) or a negative sample (l=0).

6. The method according to claim 5, characterized in that margin information is introduced in step S4 to enhance the separability between DHSiam network feature vectors; an additional margin penalty m is added between the output feature vectors Cx and Cy of the original image and the image to be registered, simultaneously enhancing intra-class compactness and inter-class differences; the additive-margin cross entropy loss function is defined as follows:
where N is the number of training samples and pj is the modulus of the j-th feature vector output by the Siamese network, with scale calibration s=1; the loss adds the margin additively to soften the margin penalty.

7. The method according to claim 1, characterized in that an attention heterogeneity mechanism is added in step S2; in the mechanism, the position regression unit first solves the feature rotation problem, and on this basis a heterogeneous attention mechanism is added to model channel importance, reduce the number of model parameters, and output refined feature maps.

8. The method according to claim 7, characterized in that in step S3 the position regression unit is first embedded in the ResNet residual block and merges pooled maps along the channel axis into an effective feature descriptor of size R^(h×w×2), followed by cascaded convolutions in which convolutional layers generate the spatial mapping region of a single image; the position regression unit uses two channel pooling operations to aggregate the channel information of a feature map, generating two 2-D maps that represent the channel-level maximum-fused and average-fused features, respectively, which are then fed to a standard convolutional layer, with the formula:
where the resulting quantity denotes the angle between the feature values of the i-th and (i-1)-th input images in a single batch M, i.e., the rotation angle of the feature matrix at position i; the rotation layer outputs the angle-corrected feature information, and padding is added to each rotated region to avoid feature loss.

9. The method according to claim 7, characterized in that in step S3 the lightweight unit is composed of a feature split module and a feature mapping module; a new hyperparameter K is introduced that specifies the number of splits of the feature channels, the obtained hyperparameter being referred to as the cardinality group (K=32 herein); on this basis, a further hyperparameter r is introduced for lightweighting, which limits the number of splits of the channel groups within the unit, so the total number of channels is c=Kr; the input block c is divided along the channel dimension into Fi={F1, F2, ..., Fc}, where i∈{1, 2, ..., c}, followed by a global max pooling operation that aggregates the feature channel information and generates the feature descriptor Ur(F)∈R^(H×W×c/N/r), where r∈{1, 2, ..., N}, N being the number of feature-vector channels per group after grouping, and H and W the height and width of the channel vector (after global pooling, H=W); through global average pooling across the channel dimension, global context information with embedded channel statistics sr∈R^(c/K/r) is collected, with the formula:
where each feature-map channel is produced by a weighted combination of the splits, the c-th channel being computed as:
where aj(c) denotes the soft channel assignment weight:
where the mapping is the weight ratio of the sr channel positions represented by the context information;
dynamic convolution kernels containing different kernel sizes are connected in parallel; since randomly shrinking kernel sizes would leave some important features insufficiently extracted or lost, the soft channel assignment weight a is introduced through r-Softmax, and the kernel size of each channel is adjusted according to its weight factor, with the formula:
where s is a threshold parameter given by the actual detection requirements, i.e., s-1 1×1 convolution kernels are inserted between every pair of K×K kernels; based on the per-channel weights obtained from the formula, the kernels on low-weight channels are reduced from 3×3 to 1×1 while 3×3 kernels are kept on the important channels.

10. The method according to claim 7, characterized in that in step S3 the position regression unit and the lightweight unit are arranged in series, and the positions and numbers of the two mechanisms in the whole model are adjusted according to the sample set.
CN202311008463.8A 2023-08-11 2023-08-11 Industrial printed matter image registration method and device based on lightweight network Pending CN117036427A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311008463.8A CN117036427A (en) 2023-08-11 2023-08-11 Industrial printed matter image registration method and device based on lightweight network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311008463.8A CN117036427A (en) 2023-08-11 2023-08-11 Industrial printed matter image registration method and device based on lightweight network

Publications (1)

Publication Number Publication Date
CN117036427A true CN117036427A (en) 2023-11-10

Family

ID=88634889

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311008463.8A Pending CN117036427A (en) 2023-08-11 2023-08-11 Industrial printed matter image registration method and device based on lightweight network

Country Status (1)

Country Link
CN (1) CN117036427A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20190128978A (en) * 2018-05-09 2019-11-19 한국과학기술원 Method for estimating human emotions using deep psychological affect network and system therefor
CN111460931A (en) * 2020-03-17 2020-07-28 华南理工大学 Face spoofing detection method and system based on color channel difference map features
CN113128341A (en) * 2021-03-18 2021-07-16 杭州电子科技大学 Dog face identification method based on convolutional neural network
WO2023273290A1 (en) * 2021-06-29 2023-01-05 山东建筑大学 Object image re-identification method based on multi-feature information capture and correlation analysis
CN116306813A (en) * 2023-03-07 2023-06-23 西安电子科技大学 Method based on YOLOX light weight and network optimization


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
MA, YD et al.: "Additive margin cosine loss for image registration", The Visual Computer, 31 May 2022 (2022-05-31) *
XU Yunfei; ZHANG Duzhou; WANG Li; HUA Baocheng: "Lightweight feature-fusion network design for local feature recognition of non-cooperative targets", Infrared and Laser Engineering, no. 07, 25 July 2020 (2020-07-25) *
DAI Zhihao: "Tai Chi action recognition and evaluation based on graph convolutional neural networks", Electronic Journal of Master's Theses, 15 January 2023 (2023-01-15) *
MAO Xueyu; PENG Yanbing: "Landmark recognition with incremental angular-domain loss and multi-feature fusion", Journal of Image and Graphics, no. 08, 12 August 2020 (2020-08-12) *

Similar Documents

Publication Publication Date Title
CN113160062B (en) An infrared image target detection method, device, equipment and storage medium
Liu et al. SIFT Flow: Dense correspondence across different scenes
CN108427924A Text regression detection method based on rotation-sensitive features
CN113159043A (en) Feature point matching method and system based on semantic information
CN110399888B (en) Weiqi judging system based on MLP neural network and computer vision
CN113095371B (en) A feature point matching method and system for 3D reconstruction
CN116777905B (en) Intelligent industrial rotation detection method and system based on long tail distribution data
Zheng et al. Feature enhancement for multi-scale object detection
CN110910497B (en) Method and system for realizing augmented reality map
CN118628905A (en) A multi-scale feature enhanced deep learning SAR ship detection method
CN110458128A (en) A method, device, device and storage medium for acquiring attitude feature
CN117392508A (en) A target detection method and device based on coordinate attention mechanism
CN119672340A (en) Semantic detail fusion and context enhancement remote sensing image segmentation method based on DeepLabv3+
CN118675022A (en) Multi-mode ship target association method based on multi-feature fusion
CN104463962B (en) Three-dimensional scene reconstruction method based on GPS information video
CN113159158A (en) License plate correction and reconstruction method and system based on generation countermeasure network
CN112668662A (en) Outdoor mountain forest environment target detection method based on improved YOLOv3 network
CN117011535A (en) Feature point matching method and device, electronic equipment and storage medium
CN113256556B (en) Image selection method and device
CN112766057B (en) A fine-grained attribute-driven gait dataset synthesis method for complex scenes
CN114387489A (en) Power equipment identification method and device and terminal equipment
CN109284752A (en) A rapid detection method for vehicles
CN119648913A (en) A street scene reconstruction method, electronic device and storage medium
CN117036427A (en) Industrial printed matter image registration method and device based on lightweight network
CN118968014A (en) A multi-feature recognition method for infrared dim small targets based on residual learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination