CN112257796A - Image integration method of convolutional neural network based on selective characteristic connection - Google Patents
- Publication number
- CN112257796A
- Authority
- CN
- China
- Prior art keywords
- feature
- neural network
- convolutional neural
- level
- characteristic
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4084—Scaling of whole images or parts thereof, e.g. expanding or contracting in the transform domain, e.g. fast Fourier transform [FFT] domain scaling
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Biomedical Technology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Health & Medical Sciences (AREA)
- Evolutionary Biology (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses an image integration method for a convolutional neural network based on selective feature connection, comprising the following steps: compute the average features of the low-level features and of the high-level features, respectively; subtract the low-level average feature from the high-level average feature to obtain the score of the key feature map; scale the high-level average feature; apply Softmax normalization to obtain the feature Z; and apply maximum-value normalization to the feature Z to obtain the attention score. This high-low-level feature fusion scheme based on selective feature connection integrates feature map information better and exploits the learned features more effectively without increasing the number of parameters. It optimizes the structure of the convolutional neural network and improves network performance; the gain is especially significant for shallow convolutional neural networks, allowing them to be applied in more fields.
Description
Technical Field
The invention belongs to the technical field of convolutional neural networks, and particularly relates to an image integration method for a convolutional neural network based on selective feature connection.
Background
In recent years, research on network architectures has attracted much attention, and many excellent architectures have been proposed in succession. GoogLeNet constructs a 22-layer convolutional neural network, yet reduces the number of parameters from 60 million to 4 million by using the Inception module. VGGNet demonstrates that increasing network depth with very small convolution filters can effectively boost model performance. However, increasing depth is not simply a matter of stacking more layers: adding layers to a model of appropriate depth can result in higher training error, because vanishing and exploding gradients make deep networks difficult to train. Highway Networks propose an effective method that uses bypasses and gating units to train end-to-end networks with more than 100 layers; the bypass is considered a key factor in training these very deep networks. ResNet further supports this view: it adds identity mappings as bypasses in the network, and by using residual blocks it has made breakthrough advances in many challenging tasks (image recognition, localization, detection, etc.).
Novel visualization techniques enable an in-depth understanding of the intermediate-layer features of a convolutional neural network and of the operation of the classifier. In fact, feature maps at different levels extract different kinds of information from the input image: low-level features capture more detailed information, while high-level features capture more semantic information, which is closer to the final layer carrying the class labels. In many computer vision tasks, combining high-level and low-level information can effectively improve experimental performance.
At present, the convolutional neural network (CNN) is an important branch of deep learning, and the hardware foundation this main research direction requires has gradually matured. As hardware has improved, deep learning algorithms have diversified; low-level languages such as C and C++ can no longer meet many deep learning research needs, and more convenient and flexible development frameworks such as TensorFlow, Caffe, Theano, Keras, and Torch have emerged. Visualization techniques make it possible to analyze each layer of features in a convolutional neural network in depth: high-level features contain more semantic information, and low-level features contain more detailed information. Integrating high-level and low-level information to improve experimental performance is therefore an important research direction for convolutional neural networks in many computer vision tasks.
In a convolutional neural network, fusing high-level and low-level features is an effective way to improve network performance. However, low-level features suffer from background clutter and semantic ambiguity, so fusing high-level and low-level features directly may carry that clutter and ambiguity into the fused features, degrading network performance.
Disclosure of Invention
In view of the defects of the prior art, the technical problem to be solved by the invention is to provide an image integration method for a convolutional neural network based on selective feature connection, in which the low-level features are first processed through a selective feature connection mechanism and then fused with the high-level features, thereby improving network performance.
To solve the above technical problem, the invention provides an image integration method for a convolutional neural network based on selective feature connection, comprising the following steps:
Step 1: compute the average features of the low-level features and of the high-level features, respectively;
Step 2: subtract the low-level average feature from the high-level average feature obtained in step 1 to obtain the score of the key feature map;
Step 3: scale the average feature of the high-level features;
Step 4: apply Softmax normalization to the key-feature-map score from step 2 and to the scaled result from step 3, respectively, and combine them to obtain the feature Z;
Step 5: apply maximum-value normalization to the feature Z to obtain the attention score.
Optionally, in step 1, the average feature of the low-level features is:
Am(i,j) = (1/C1) · Σ_{c=1..C1} A(i,j,c)
and the average feature of the high-level features is:
Bm(i,j) = (1/C2) · Σ_{c=1..C2} B(i,j,c)
where Am ∈ R^(F×G×1) and Bm ∈ R^(F×G×1), A(i,j,c) is the value of the low-level feature A at spatial location (i, j, c), B(i,j,c) is the value of the high-level feature B at spatial location (i, j, c), C1 is the number of low-level feature maps, and C2 is the number of high-level feature maps.
Optionally, in step 2, the score of the key feature map is:
P = Bm − Am.
Further, in step 3, the average feature of the high-level features is scaled as:
D = Bm · n
where n is a scaling factor.
Further, in step 4, Softmax normalization is applied to the key-feature-map score from step 2 and to the scaled result from step 3, respectively:
SP(i,j) = exp(P(i,j)) / Σ_{u,v} exp(P(u,v))
SD(i,j) = exp(D(i,j)) / Σ_{u,v} exp(D(u,v))
The resulting feature Z is:
Z = SP − SD.
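To make the flow concrete, the following is a minimal NumPy sketch of steps 1 through 5, assuming the Softmax is taken over spatial positions and that the scaling factor n is a free hyperparameter; neither choice is fixed by the text above.

```python
import numpy as np

def attention_score(A, B, n=0.5):
    """Steps 1-5. A: low-level features (F, G, C1);
    B: high-level features (F, G, C2); n: assumed scaling factor."""
    Am = A.mean(axis=-1)                 # step 1: channel-wise average of A
    Bm = B.mean(axis=-1)                 # step 1: channel-wise average of B
    P = Bm - Am                          # step 2: score of the key feature map
    D = Bm * n                           # step 3: scaled high-level average
    SP = np.exp(P) / np.exp(P).sum()     # step 4: Softmax over spatial positions
    SD = np.exp(D) / np.exp(D).sum()     # step 4: Softmax over spatial positions
    Z = SP - SD                          # step 4: feature Z
    return Z / Z.max()                   # step 5: maximum-value normalization
```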
Therefore, the image integration method for a convolutional neural network based on selective feature connection according to the invention has the following beneficial effects:
The high-low-level feature fusion scheme based on selective feature connection integrates feature map information better and exploits the learned features more effectively without increasing the number of parameters. It optimizes the structure of the convolutional neural network and improves network performance; the gain is especially significant for shallow convolutional neural networks, allowing them to be applied in more fields.
The foregoing is only an overview of the technical solution of the invention. To make the technical means of the invention clearer, so that it can be implemented according to the contents of the specification, and to make the above and other objects, features, and advantages of the invention more readily understood, preferred embodiments are described in detail below in conjunction with the accompanying drawings.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings of the embodiments will be briefly described below.
Fig. 1 is a diagram of the CNN network architecture with selective feature connection;
Fig. 2 is a diagram of direct fusion of high-level and low-level features;
Fig. 3 is a diagram of additive fusion of high-level and low-level features;
Fig. 4 is a diagram of the selective feature computation process.
Detailed Description
Other aspects, features and advantages of the present invention will become apparent from the following detailed description, taken in conjunction with the accompanying drawings, which form a part of this specification, and which illustrate, by way of example, the principles of the invention. In the referenced drawings, the same or similar components in different drawings are denoted by the same reference numerals.
The invention applies a general network architecture, the Selective Feature Connection Mechanism (SFCM), to connect convolutional neural network features from different layers. Features at different layers contain different information: higher-level features contain more semantic information, while lower-level features contain more detailed information. However, the lower-level features are affected by the background, which causes background clutter and semantic ambiguity, and combining high-level and low-level features directly carries this clutter and ambiguity into the result. SFCM effectively overcomes this drawback: inspired by the human visual recognition mechanism, low-level features are selectively connected to high-level features through a feature selector generated from the high-level features. The mechanism can be employed in many network architectures.
A classical convolutional neural network consists of an input layer, convolutional layers, pooling layers, a fully connected layer, and an output layer; the extracted convolutional features run from low level to high level, and the final output is obtained from the high-level features. To improve the performance of the neural network, the invention borrows the residual structure of the ResNet model and obtains the final output after fusing the high-level and low-level features through the selective feature connection mechanism. The network structure is shown in Fig. 1.
the selective characteristic operation process will be described in detail below.
Existing methods for fusing high- and low-level features generally concatenate the feature maps of different layers directly; the combined feature is shown in formula (1):
O=[A,B] (1)
where A ∈ R^(F×G×C1) denotes the low-level features of the convolutional neural network, C1 denotes the number of low-level feature maps, and G and F denote the width and height of the feature maps; B ∈ R^(F×G×C2) denotes the high-level features and C2 the number of high-level feature maps; and O ∈ R^(F×G×(C1+C2)) denotes the combined features. The whole process is shown in Fig. 2.
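As a one-line NumPy illustration of formula (1), with illustrative shapes (F = G = 8, C1 = 64, C2 = 128 are arbitrary here):

```python
import numpy as np

F, G, C1, C2 = 8, 8, 64, 128
A = np.random.rand(F, G, C1)           # low-level features
B = np.random.rand(F, G, C2)           # high-level features
O = np.concatenate([A, B], axis=-1)    # formula (1): channel-wise concatenation
assert O.shape == (F, G, C1 + C2)      # the combined feature has C1 + C2 channels
```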
However, the combined features obtained by direct concatenation sharply increase the number of parameters in the fully connected layer, so the invention fuses high- and low-level features by addition instead, as shown in formula (2):
O=A1+B (2)
where A ∈ R^(F×G×C1) denotes the low-level features of the convolutional neural network, A1 ∈ R^(F×G×C2) denotes the transformed low-level features, B ∈ R^(F×G×C2) denotes the high-level features, and O ∈ R^(F×G×C2) denotes the combined features. The process is shown in Fig. 3.
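A sketch of formula (2) in PyTorch; the text does not specify how the low-level features are transformed into A1, so a 1×1 convolution mapping C1 to C2 channels is assumed here, and the feature maps are assumed to be spatially aligned already:

```python
import torch
import torch.nn as nn

C1, C2 = 64, 128
to_a1 = nn.Conv2d(C1, C2, kernel_size=1)   # assumed transform: C1 -> C2 channels

A = torch.randn(1, C1, 8, 8)               # low-level features (NCHW)
B = torch.randn(1, C2, 8, 8)               # high-level features
A1 = to_a1(A)                              # transformed low-level features
O = A1 + B                                 # formula (2): additive fusion
```

Unlike concatenation, the addition leaves the channel count at C2, so the fully connected layer that follows needs no extra parameters.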
However, directly connecting low-level and high-level features does not fully exploit their complementary nature. High-level features contain more semantic information, and low-level features contain more detailed information; combining them directly may cause background clutter and semantic ambiguity because too much detailed information is introduced. The invention therefore proposes the Selective Feature Connection Mechanism (SFCM), inspired by the human visual recognition mechanism: an attention score is assigned to each element of the low-level feature map, representing the importance of that element.
First, the average features Am and Bm of the low-level and the high-level features are computed, respectively. The average feature of the low-level features is shown in formula (3):
Am(i,j) = (1/C1) · Σ_{c=1..C1} A(i,j,c) (3)
The average feature of the high-level features is shown in formula (4):
Bm(i,j) = (1/C2) · Σ_{c=1..C2} B(i,j,c) (4)
where Am ∈ R^(F×G×1) and Bm ∈ R^(F×G×1), A(i,j,c) is the value of A at spatial location (i, j, c), and B(i,j,c) is the value of B at spatial location (i, j, c).
Shallow layers of a network extract texture and detail features, while deep layers extract contour, shape, and the strongest features. The shallow layers contain more features, including the key ones, but the deeper the layer, the more representative the extracted features and the more prominent the key features. Therefore, the average feature of the low level is subtracted from the average feature of the high level to obtain the score P of the key feature map, as shown in formula (5):
P=Bm-Am (5)
the average feature Bm of the high-level features is scaled to obtain D, as shown in equation (6):
D=Bm*n (6)
Softmax normalization is performed on P and D respectively, as shown in formulas (7) and (8):
SP(i,j) = exp(P(i,j)) / Σ_{u,v} exp(P(u,v)) (7)
SD(i,j) = exp(D(i,j)) / Σ_{u,v} exp(D(u,v)) (8)
Thus, the feature Z can be obtained as shown in formula (9); it represents the importance of the corresponding position of each element of the low-level features.
Z=SP-SD (9)
The attention score M, i.e. the feature selector, is obtained by performing maximum-value normalization on the feature Z, as shown in formula (10):
M(i,j) = Z(i,j) / max_{u,v} Z(u,v) (10)
where M ∈ R^(F×G×1) and M(i,j) is the final score at position (i, j). The learned attention score represents the importance of the corresponding position of each element of the low-level features, so multiplying the low-level features by the attention score screens out their important parts. The new low-level feature As is obtained from formula (11):
As(i,j,c) = M(i,j) · A(i,j,c) (11)
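Formula (11) reads as a broadcast of the single-channel score map over the C1 channels of A; a short NumPy illustration:

```python
import numpy as np

F, G, C1 = 8, 8, 64
A = np.random.rand(F, G, C1)    # low-level features
M = np.random.rand(F, G)        # attention score from formula (10)
As = A * M[..., None]           # formula (11): M broadcast over the channel axis
```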
the new low-level features are augmented to a1 for fusing the high-level features. And (4) fusing the high-low layer features, and calculating a fusion coefficient L. If the average score of each of the feature maps a1 and B is E and F, it is determined from equation (12) and equation (13).
From this, a fusion coefficient L can be calculated as shown in equation (14)
The final combined features are then as shown in equation (15):
O=L*A1+B (15)
The whole selective feature operation process is shown in Fig. 4.
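Putting the pieces together, the following is a sketch of the whole pipeline of Fig. 4. The spatial Softmax, the channel-augmenting transform, and the treatment of the fusion coefficient L as an externally supplied scalar are all assumptions, since formula (14) is not reproduced above.

```python
import numpy as np

def softmax2d(x):
    """Softmax over all spatial positions (assumed interpretation)."""
    e = np.exp(x - x.max())
    return e / e.sum()

def sfcm(A, B, augment, L, n=0.5):
    """Selective Feature Connection Mechanism, formulas (3)-(11) and (15).
    A: low-level features (F, G, C1); B: high-level features (F, G, C2);
    augment: assumed channel transform C1 -> C2; L: fusion coefficient,
    taken as given since formula (14) is not reproduced here."""
    Am, Bm = A.mean(axis=-1), B.mean(axis=-1)   # formulas (3), (4)
    P = Bm - Am                                 # formula (5)
    D = Bm * n                                  # formula (6)
    Z = softmax2d(P) - softmax2d(D)             # formulas (7)-(9)
    M = Z / Z.max()                             # formula (10): feature selector
    As = A * M[..., None]                       # formula (11): selected features
    A1 = augment(As)                            # new low-level features, C2 channels
    return L * A1 + B                           # formula (15): combined features
```

For example, when C1 == C2 the augmentation can be the identity: `O = sfcm(A, B, lambda x: x, L=1.0)`.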
As the feature selector M shows, it enhances the salient regions of the low-level features and suppresses their background regions; with SFCM, most pixels in the low-level feature map are suppressed. Therefore, without damaging the semantic expression capability of the high-level features, more detailed information from the salient regions of the low-level feature map is added, further enhancing the expressive power of the features and yielding better performance.
Two convolutional neural networks were built to carry out image classification experiments on the CIFAR-10 and CIFAR-100 datasets. First, a 9-layer convolutional neural network model was built, comprising an input layer, 3 convolutional layers, 3 pooling layers, 1 fully connected layer (the feature extraction layer), and an output layer (a Softmax layer). An 11-layer convolutional neural network model was also built, comprising an input layer, 5 convolutional layers, 3 pooling layers, 1 fully connected layer (the feature extraction layer), and an output layer.
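For orientation, a hypothetical PyTorch layout of the 9-layer model is sketched below; the channel widths, kernel sizes, and 32×32 CIFAR input are assumptions, since only the layer counts are stated above.

```python
import torch.nn as nn

# 9 layers as counted above: input, 3 conv, 3 pooling, 1 fully connected
# (feature extraction) layer, and a Softmax output layer.
model = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(128 * 4 * 4, 10),   # feature extraction layer (CIFAR-10 classes)
    nn.Softmax(dim=1),            # output layer; usually folded into the loss
)
```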
The experimental results are shown in Table 1.
Table 1: Image recognition rates on the CIFAR-10 and CIFAR-100 datasets
As can be seen from Table 1, direct fusion of high-level and low-level features can decrease the image recognition rate of the neural network, while the convolutional neural network based on the selective feature connection mechanism ensures an improvement: compared with the conventional convolutional neural network, it improves accuracy by 0.9% on CIFAR-10 and by 1.4% on CIFAR-100, demonstrating the effectiveness and superiority of the selective feature connection mechanism.
While the foregoing is directed to the preferred embodiment of the invention, other and further embodiments may be devised without departing from its basic scope, which is determined by the claims that follow.
Claims (6)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202011174153.XA CN112257796B (en) | 2020-10-28 | 2020-10-28 | An image integration method based on convolutional neural network with selective feature connection |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202011174153.XA CN112257796B (en) | 2020-10-28 | 2020-10-28 | An image integration method based on convolutional neural network with selective feature connection |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN112257796A true CN112257796A (en) | 2021-01-22 |
| CN112257796B CN112257796B (en) | 2024-06-28 |
Family
ID=74261703
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202011174153.XA Expired - Fee Related CN112257796B (en) | 2020-10-28 | 2020-10-28 | An image integration method based on convolutional neural network with selective feature connection |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN112257796B (en) |
Citations (7)
Patent Citations (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2019144575A1 (en) * | 2018-01-24 | 2019-08-01 | 中山大学 | Fast pedestrian detection method and device |
| CN109902748A (en) * | 2019-03-04 | 2019-06-18 | 中国计量大学 | An Image Semantic Segmentation Method Based on Multi-layer Information Fusion Fully Convolutional Neural Network |
| CN110084794A (en) * | 2019-04-22 | 2019-08-02 | 华南理工大学 | A kind of cutaneum carcinoma image identification method based on attention convolutional neural networks |
| CN110097145A (en) * | 2019-06-20 | 2019-08-06 | 江苏德劭信息科技有限公司 | One kind being based on CNN and the pyramidal traffic contraband recognition methods of feature |
| CN110728192A (en) * | 2019-09-16 | 2020-01-24 | 河海大学 | High-resolution remote sensing image classification method based on novel characteristic pyramid depth network |
| CN111553289A (en) * | 2020-04-29 | 2020-08-18 | 中国科学院空天信息创新研究院 | A kind of remote sensing image cloud detection method and system |
| CN111753752A (en) * | 2020-06-28 | 2020-10-09 | 重庆邮电大学 | Robot closed-loop detection method based on multi-layer feature fusion of convolutional neural network |
Non-Patent Citations (1)
| Title |
|---|
| CHEN DU et al.: "Selective Feature Connection Mechanism: Concatenating Multi-layer CNN Features with a Feature Selector", ARXIV, pages 1 - 8 * |
Also Published As
| Publication number | Publication date |
|---|---|
| CN112257796B (en) | 2024-06-28 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN111354017B (en) | Target tracking method based on twin neural network and parallel attention module | |
| CN110570458B (en) | Target tracking method based on internal cutting and multi-layer characteristic information fusion | |
| CN108229338B (en) | Video behavior identification method based on deep convolution characteristics | |
| CN109191491B (en) | Target tracking method and system of full convolution twin network based on multi-layer feature fusion | |
| CN112950477B (en) | A High Resolution Salient Object Detection Method Based on Dual Path Processing | |
| CN105701508B (en) | Global local optimum model and conspicuousness detection algorithm based on multistage convolutional neural networks | |
| CN115147456B (en) | A Target Tracking Method Based on Timing Adaptive Convolution and Attention Mechanism | |
| CN114219824B (en) | Visible light-infrared target tracking method and system based on deep network | |
| CN110533024B (en) | Double-quadratic pooling fine-grained image classification method based on multi-scale ROI (region of interest) features | |
| CN111507334B (en) | An instance segmentation method based on key points | |
| CN118429389B (en) | Target tracking method and system based on multiscale aggregation attention feature extraction network | |
| CN111967524A (en) | Multi-scale fusion feature enhancement algorithm based on Gaussian filter feedback and cavity convolution | |
| CN105069413A (en) | Human body gesture identification method based on depth convolution neural network | |
| CN115393596B (en) | Garment image segmentation method based on artificial intelligence | |
| CN107844795A (en) | Convolutional neural network feature extraction method based on principal component analysis | |
| CN112164077B (en) | Cell instance segmentation method based on bottom-up path enhancement | |
| CN116152926A (en) | Sign language identification method, device and system based on vision and skeleton information fusion | |
| CN112069943A (en) | Online multi-person posture estimation and tracking method based on top-down framework | |
| CN116129289A (en) | Attention edge interaction optical remote sensing image saliency target detection method | |
| CN107301644A (en) | Natural image non-formaldehyde finishing method based on average drifting and fuzzy clustering | |
| CN118230106A (en) | A weakly supervised salient object detection method based on enhanced graffiti annotations | |
| CN114693951A (en) | An RGB-D Saliency Object Detection Method Based on Global Context Information Exploration | |
| CN117409359A (en) | Fire detection method of dynamic multi-scale attention network | |
| CN112800958B (en) | Lightweight human body key point detection method based on heat point diagram | |
| CN114511895B (en) | Natural scene emotion recognition method based on attention mechanism multi-scale network |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |
| | CF01 | Termination of patent right due to non-payment of annual fee | |
Granted publication date: 20240628 |