Disclosure of Invention
The application aims to overcome the defects in the prior art and provide a hyperspectral image classification method with strong feature extraction capability and full utilization of sample information.
In order to achieve the above purpose, the application is realized by adopting the following technical scheme:
In a first aspect, the present application provides a hyperspectral image classification method, comprising:
acquiring a hyperspectral image, and extracting a first feature map from the hyperspectral image;
performing a KAN convolution operation on the first feature map to capture spatial and spectral features;
updating the first feature map by using an adaptive attention mechanism to amplify the spatial and spectral features in the first feature map to obtain a second feature map, wherein the adaptive attention mechanism calculates the similarity between a query and a key based on the Hamming distance;
performing refocusing convolution on the second feature map, adjusting the response intensity of local areas in the second feature map, and amplifying the feature expression in each local area according to its response intensity to obtain a third feature map;
linearly combining the features in the third feature map, and identifying the linearly combined features to obtain a classification result.
In some embodiments of the first aspect, the extracting of a first feature map from the hyperspectral image includes scanning each local area in the hyperspectral image using a filter to extract key feature positions, and performing a KAN convolution operation on the hyperspectral image at those key feature positions to obtain the first feature map;
Updating the first feature map using the adaptive attention mechanism to amplify the spatial and spectral features in the first feature map and obtain the second feature map includes: binarizing the first feature map processed by the adaptive attention mechanism using a pulse activation function; performing a KAN convolution operation on the binarized first feature map to obtain a first feature map with highlighted spatial and spectral features; performing a second binarization on the first feature map with highlighted spatial and spectral features using the pulse activation function to suppress noise in it; and reducing the dimension of the first feature map after the second binarization through a multi-scale pooling operation, wherein the dimension-reduced first feature map is obtained by the following calculation:

$$P(x,y) = \max_{0 \le i,\, j < k} B(x \cdot k + i,\; y \cdot k + j)$$

In the formula, $P(x,y)$ is the first feature map after dimension reduction, $x$ and $y$ are coordinates, $k$ is the pooling window size, $i$ and $j$ are the count sequence numbers of the pooled traversal elements, and $B$ is the first feature map obtained after the second binarization processing;
a second feature map is then obtained according to the dimension-reduced first feature map.
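As a concrete illustration, the dimension-reduction step described above can be sketched in code. This is a minimal sketch assuming non-overlapping max pooling over k-by-k windows (the description does not fix the pooling operator); the function name and array shapes are illustrative only, not the application's implementation.

```python
import numpy as np

def pool_feature_map(binarized_map: np.ndarray, k: int = 2) -> np.ndarray:
    """Non-overlapping max pooling with window size k: a sketch of the
    dimension reduction applied to the binarized first feature map."""
    h, w = binarized_map.shape
    h_out, w_out = h // k, w // k
    pooled = np.zeros((h_out, w_out), dtype=binarized_map.dtype)
    for x in range(h_out):
        for y in range(w_out):
            # traverse the k x k window anchored at (x*k, y*k)
            pooled[x, y] = binarized_map[x*k:(x+1)*k, y*k:(y+1)*k].max()
    return pooled
```

Because the input here is already binarized, max pooling simply reports whether any spike occurred in each window, which matches the goal of keeping key responses while shrinking the map.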
In some embodiments of the first aspect, performing the refocusing convolution on the second feature map, adjusting the response intensity of the local areas in the second feature map, and amplifying the feature expression in each local area according to its response intensity to obtain the third feature map includes:
performing the refocusing convolution on the second feature map a plurality of times, and binarizing the second feature map using a pulse activation function after each refocusing convolution so as to amplify the feature expression in the local areas;
the refocusing convolution process is as follows:
Calculating a characteristic value after refocusing convolution by the following formula:
$$F(x,y) = \sum_{m}\sum_{n} W(m,n)\, X(x+m,\; y+n) + b$$

In the formula, $F(x,y)$ is the eigenvalue after the refocusing convolution, $W$ is the weight matrix of the refocusing convolution kernel, $X$ is the input second feature map, $x$ and $y$ are coordinates, $m$ and $n$ are the coordinates of the convolution kernel, and $b$ is an offset term used to adjust the offset of the output in the refocusing convolution operation;
A learnable parameter is introduced to dynamically adjust the eigenvalue, and the dynamically adjusted eigenvalue is obtained by the following formula:

$$\tilde{F}_c = \alpha_c \cdot F_c$$

In the formula, $\tilde{F}_c$ is the dynamically adjusted eigenvalue of the $c$-th channel, $c$ is the channel index, $\alpha_c$ is the learnable parameter of the $c$-th channel, used to control the response intensity of the $c$-th channel, and $F_c$ is the eigenvalue of the $c$-th channel after the refocusing convolution;
The dynamically adjusted eigenvalues of the channels are then recombined to obtain a second feature map on which a single refocusing convolution has been completed.
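The refocusing convolution and the channel-wise dynamic adjustment described above can be sketched as follows, assuming a single-channel "valid" convolution and a per-channel multiplicative learnable parameter; the names and shapes are illustrative assumptions, not the application's implementation.

```python
import numpy as np

def refocus_conv(x: np.ndarray, w: np.ndarray, b: float = 0.0) -> np.ndarray:
    """'Valid' 2-D refocusing convolution: slide the kernel w over the
    second feature map x and add the offset term b at every position."""
    kh, kw = w.shape
    out_h, out_w = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(w * x[i:i + kh, j:j + kw]) + b
    return out

def dynamic_adjust(features: np.ndarray, alpha) -> np.ndarray:
    """Scale channel c of `features` (shape: channels x H x W) by the
    learnable parameter alpha[c], controlling its response intensity."""
    return np.asarray(alpha)[:, None, None] * features
```

In training, `alpha` would be optimized with the rest of the network, letting the model learn which channels' responses to amplify or attenuate.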
In some embodiments of the first aspect,
Updating the second feature map with amplified feature expression by using an adaptive attention mechanism, wherein the adaptive attention mechanism calculates the similarity between the query and the key based on the hamming distance, and optimizes key features in the second feature map according to the similarity between the query and the key;
performing a refocusing convolution update on the second feature map updated by the adaptive attention mechanism;
the dimension of the second feature map after the refocusing convolution update is reduced through a multi-scale pooling operation, and the dimension-reduced second feature map is obtained through the following formula:

$$P'(x,y) = \max_{0 \le i,\, j < k} F'(x \cdot k + i,\; y \cdot k + j)$$

In the formula, $P'(x,y)$ is the second feature map after dimension reduction, $x$ and $y$ are coordinates, $k$ is the pooling window size, $i$ and $j$ are the count sequence numbers of the pooled traversal elements, and $F'$ is the second feature map updated by the refocusing convolution;
and a third feature map is obtained according to the dimension-reduced second feature map.
In some embodiments of the first aspect, linearly combining the features in the third feature map and identifying the linearly combined features to obtain the classification result includes:
the linearly combined features are obtained according to the following formula:

$$Z = W_f \cdot T + b_f$$

In the formula, $W_f$ is the weight matrix of the fully connected layer in the adaptive attention mechanism, $T$ is the third feature map, $b_f$ is the bias term, and $Z$ is the linearly combined feature.
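A minimal sketch of the linear combination and identification step, assuming the third feature map is flattened, passed through one fully connected layer, and the class with the largest response is reported; the helper name and shapes are illustrative assumptions.

```python
import numpy as np

def classify(third_map, W: np.ndarray, b: np.ndarray) -> int:
    """Flatten the third feature map t, form Z = W @ t + b with the
    fully connected layer, and return the index of the largest response."""
    t = np.asarray(third_map, dtype=float).ravel()
    z = W @ t + b  # linearly combined features
    return int(np.argmax(z))
```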
In some embodiments of the first aspect, the KAN convolution operation has the following formula:
$$Y(x,y,\lambda) = \sum_{m}\sum_{n}\sum_{s} \phi_{W(m,n,s)}\big(I(x+m,\; y+n,\; \lambda+s)\big)$$

In the formula, $x$ and $y$ are coordinates, $\lambda$ is the spectral dimension, $Y$ is the first feature map after the KAN convolution operation, $W$ is the weight matrix of the KAN convolution kernel, parameterizing the learnable activation functions $\phi$, $m$ and $n$ are the coordinates of the convolution kernel, the summation over $m$, $n$ and $s$ represents the sliding operation of the KAN convolution kernel, $s$ is the spectral dimension of the convolution kernel, and $I(x+m, y+n, \lambda+s)$ is the pixel value at coordinates $(x+m, y+n)$ and spectral dimension $\lambda+s$;
The first convolution kernel of the KAN convolution operation employs a Gaussian radial basis function convolution kernel.
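To make the KAN convolution idea concrete, the following sketch applies a learnable Gaussian-RBF function to every pixel in each window and sums the activations, instead of taking a kernel-pixel dot product. It treats a single spectral band and a single shared function for brevity; all names, parameter shapes, and the single-function simplification are illustrative assumptions.

```python
import numpy as np

def gaussian_rbf(v: float, centers, widths, coeffs) -> float:
    """Learnable scalar function phi(v) built from Gaussian radial basis
    functions: sum_r coeffs[r] * exp(-((v - centers[r]) / widths[r])**2)."""
    return float(np.sum(coeffs * np.exp(-((v - centers) / widths) ** 2)))

def kan_conv2d(img: np.ndarray, centers, widths, coeffs, k: int = 3) -> np.ndarray:
    """KAN-style convolution on one spectral band: apply phi to every
    pixel in each k x k window and sum the activations."""
    out_h, out_w = img.shape[0] - k + 1, img.shape[1] - k + 1
    out = np.empty((out_h, out_w))
    for x in range(out_h):
        for y in range(out_w):
            window = img[x:x + k, y:y + k].ravel()
            out[x, y] = sum(gaussian_rbf(v, centers, widths, coeffs) for v in window)
    return out
```

In a trained model, `centers`, `widths`, and `coeffs` would be learned per kernel element; here they are shared to keep the sketch short.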
In some embodiments of the first aspect, the processing formula for binarizing the first feature map using a pulse activation function is as follows:
$$S(x,y,\lambda) = \mathbb{1}\big(F(x,y,\lambda) > \theta\big)$$

In the formula, $x$ and $y$ are coordinates, $\lambda$ is the spectral dimension, $S$ is the first feature map after binarization processing, $\mathbb{1}(\cdot)$ is the indicator function whose inequality is the execution condition (outputting 1 when the inequality holds and 0 otherwise), $\theta$ is the activation threshold, and $F$ is the first feature map.
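A minimal sketch of the pulse-activation binarization, assuming a strict greater-than comparison against the activation threshold; the function name is illustrative.

```python
import numpy as np

def pulse_activate(feature_map, theta: float = 0.7) -> np.ndarray:
    """Binarize a feature map: emit a spike (1) where the value strictly
    exceeds the activation threshold theta, otherwise 0."""
    return (np.asarray(feature_map) > theta).astype(np.uint8)
```

The default threshold of 0.7 follows the example value used later in the description.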
In some embodiments of the first aspect, the updating the first profile using an adaptive attention mechanism includes,
Performing linear transformation on the first feature map to obtain a query vector, a key vector and a value vector;
calculating the similarity between the query vector and the key vector via the Hamming distance, wherein the similarity is obtained by the following formulas:

$$D_H(Q_i, K_j) = \sum_{l=1}^{L} q_{i,l} \oplus k_{j,l}, \qquad S_{i,j} = 1 - \frac{D_H(Q_i, K_j)}{L}$$

In the formula, $D_H(\cdot)$ is the Hamming distance calculation, $Q$ is the query vector, $K$ is the key vector, $i$ is the count number of the query vector, $j$ is the count number of the key vector, $S_{i,j}$ represents the similarity between query vector $Q_i$ and key vector $K_j$, $q_{i,l}$ is the $l$-th element of the $i$-th query vector, $k_{j,l}$ is the $l$-th element of the $j$-th key vector, $\oplus$ represents the exclusive-or operation on binary vectors (a result of 1 indicates that the two elements differ, and 0 that they are identical), $L$ is the element length of the vector, and $l$ is the count sequence number of the element;
Based on the similarity between the query vector and the key vector, an attention weight is generated, obtained through the following calculation:

$$A_{i,j} = \frac{\exp(S_{i,j})}{\sum_{j'} \exp(S_{i,j'})}$$

In the formula, $j'$ is the count number for the similarity, and $A_{i,j}$ is the attention weight of the $i$-th query vector and the $j$-th key vector;
generating a new feature vector according to the attention weights and the value vectors, and obtaining the updated first feature map according to the new feature vectors, wherein the feature vector is obtained through the following calculation:

$$O_i = \sum_{j} A_{i,j}\, V_j$$

In the formula, $O_i$ is the updated feature vector, $j$ is the count number of the value vector, and $V_j$ is the value vector corresponding to key vector $K_j$.
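The Hamming-distance attention steps above can be sketched as follows, assuming binary query/key vectors, a similarity of one minus the normalized Hamming distance, and a softmax over the keys; the names and the exact similarity scaling are illustrative assumptions.

```python
import numpy as np

def hamming_attention(Q, K, V) -> np.ndarray:
    """Attention over binary query/key vectors: similarity is
    1 - hamming_distance / L, weights come from a softmax over keys,
    and the output is the weighted sum of value vectors."""
    Q, K, V = np.asarray(Q), np.asarray(K), np.asarray(V)
    L = Q.shape[1]
    # XOR counts differing bits between every query/key pair
    hamming = (Q[:, None, :] ^ K[None, :, :]).sum(axis=2)
    sim = 1.0 - hamming / L
    w = np.exp(sim)
    w /= w.sum(axis=1, keepdims=True)  # softmax over the key axis
    return w @ V
```

Because XOR needs only integer bit operations, this similarity is cheaper and less sensitive to small perturbations than dot-product similarity on continuous values.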
In a second aspect, the present application also provides a computer device comprising a processor and a memory connected to the processor, in which memory a computer program is stored which, when executed by the processor, performs the steps of the hyperspectral image classification method as described in any of the embodiments of the first aspect.
In a third aspect, the present application also provides a computer program product comprising computer programs/instructions, characterized in that the computer programs/instructions, when executed by a processor, implement the steps of the hyperspectral image classification method according to any one of the embodiments of the first aspect.
Compared with the prior art, the application has the beneficial effects that:
The hyperspectral image classification method provided by the application first performs preliminary feature extraction on the hyperspectral image, initially removing useless information and obtaining a feature map as the basis for subsequent processing. The KAN convolution operation, in which elements mutually learn and activate one another, efficiently captures spatial and spectral features from this preliminary feature map. The adaptive attention mechanism then updates the feature map to amplify its spatial and spectral features, concentrating computing resources on those features, reducing the resources occupied by useless information, and improving the utilization of the complex information in the hyperspectral image. Because the adaptive attention mechanism computes the similarity between queries and keys based on the Hamming distance, the adverse effect of useless information mixed into the useful information is reduced when the feature map undergoes refocusing convolution. The refocusing convolution adaptively adjusts the response intensity of each local area in the feature map and controls feature expression by that intensity, mining the depth information of the image, fully utilizing the complex spatial relationships and texture information present in hyperspectral images, and improving classification efficiency and accuracy.
Detailed Description
The following detailed description of the technical solutions of the present application is given with reference to the accompanying drawings and specific embodiments. It should be understood that the specific features of the embodiments of the present application are detailed descriptions of the technical solutions, not limitations of them, and that the embodiments and the technical features of the embodiments may be combined with each other without conflict.
The term "and/or" merely describes an association relationship between associated objects, indicating that three relationships may exist; for example, A and/or B may mean that A exists alone, that A and B exist together, or that B exists alone. In addition, the character "/" herein generally indicates that the associated objects before and after it are in an "or" relationship.
Embodiment one:
fig. 1 is a flowchart of a hyperspectral image classification method in accordance with the first embodiment of the present invention. The flow chart merely shows the logical sequence of the method according to the present embodiment, and the steps shown or described may be performed in a different order than shown in fig. 1 in other possible embodiments of the invention without mutual conflict.
The hyperspectral image classification method provided by the embodiment can be applied to a terminal, and can be executed by an image classification device, wherein the device can be realized in a software and/or hardware mode, and the device can be integrated in the terminal, such as any smart phone, tablet personal computer or computer equipment with a communication function. Referring to fig. 1, the method of the present embodiment specifically includes the following steps:
A hyperspectral image is acquired, and a first feature map is extracted from it. The hyperspectral image is preliminarily processed, and useful information is extracted from different wave bands by combining the spatial and spectral characteristics of the image; this useful information mainly comprises the complex spatial relationships and texture information in the hyperspectral image. Useless information is preliminarily excluded, providing a better input source for the subsequent KAN convolution operation and for calculating the similarity between queries and keys based on the Hamming distance. The first feature map may also be extracted in other ways known in the art, which are not repeated herein.
KAN convolution is a novel convolution mode: rather than simply applying dot products between the kernel and the corresponding pixels of the image, KAN convolution applies a learnable nonlinear activation function to each element and then sums the results. Replacing traditional convolution with the KAN convolution operation therefore improves the efficiency of capturing spatial and spectral features, and reduces the complexity and computational load of the method without sacrificing model performance.
The first feature map is updated using the adaptive attention mechanism to amplify its spatial and spectral features, and the finally updated first feature map serves as the second feature map; the adaptive attention mechanism calculates the similarity between queries and keys based on the Hamming distance. The main effect of the adaptive attention mechanism is to amplify the spatial and spectral features captured by the preceding KAN convolution operation, so that the amplified features are easily captured and processed by the next refocusing convolution, which can then focus mainly on those spatial and spectral features. Although useless information was removed when the first feature map was first extracted, some remains; rather than discarding it outright, this step reduces the interference of non-important features by amplifying the important ones, so that the non-important features are more easily discarded in the subsequent refocusing convolution. Moreover, for data containing noise or imperfect matches, the Hamming distance is less sensitive to small perturbations than continuous-value similarity measures, so computing query-key similarity from the Hamming distance further suppresses the influence of useless information.
Refocusing convolution is performed on the second feature map, the response intensity of local areas in the second feature map is adjusted, and the feature expression in each local area is amplified according to its response intensity to obtain a third feature map. The preliminary extraction of the first feature map eliminated useless information, the adaptive attention mechanism updated the first feature map to amplify the spatial and spectral features, and replacing the traditional similarity calculation with the Hamming distance reduced the influence of entrained useless information; the resulting second feature map therefore contains little useless information and clean features, and provides a basis for extracting deep and detailed features. The refocusing convolution dynamically adjusts the response to different areas of the second feature map through an adaptive focusing mechanism. As is apparent to those skilled in the art, complex spatial relationships and texture information exist in hyperspectral images; on the one hand, conventional classification algorithms struggle to fully utilize this information, and on the other hand, the image information has a certain depth, so failing to study the deep details finely reduces the classification effect. The preceding steps repeatedly eliminated useless information and amplified image features to reduce the difficulty of information processing; performing refocusing convolution on the second feature map and adaptively adjusting the response intensity of local areas captures the detail features and deeply exploits the image information.
The third feature map finally obtained after the refocusing convolutions comprises a plurality of deeply mined and amplified features; even features that appear similar in the original hyperspectral image but differ in detail become clearly distinguishable after refocusing convolution. The highly discriminative features in the third feature map are then linearly combined, and the linearly combined features are identified to obtain the classification result.
In the hyperspectral image classification method of this embodiment, preliminary feature extraction is first performed on the hyperspectral image to initially remove useless information and obtain the first feature map as the basis for subsequent processing. The KAN convolution operation, exploiting the mutual learning and activation among elements, efficiently captures spatial and spectral features from the first feature map. The adaptive attention mechanism updates the first feature map and amplifies its spatial and spectral features, concentrating computing resources on those features, reducing the resources occupied by useless information, and improving the utilization of the complex information in the hyperspectral image. The adaptive attention mechanism calculates the similarity between queries and keys based on the Hamming distance, which reduces the adverse effect of useless information mixed into the useful information during the refocusing convolution of the second feature map. The refocusing convolution adaptively adjusts the response intensity of each local area in the second feature map and controls feature expression by that response intensity, mining the depth information of the image and fully utilizing the complex spatial relationships and texture information in the hyperspectral image, so that the features in the third feature map can be accurately combined and classified, improving classification efficiency and accuracy.
Embodiment two:
The present embodiment provides a hyperspectral image classification method that is optimized on the basis of the first embodiment to improve the technical effect and refine the technical scheme; for details not described in this embodiment, refer to the first embodiment.
As a possible refinement of the preliminary step in embodiment one, the extraction of the first feature map from the hyperspectral image may also employ the KAN convolution operation mentioned earlier.
Each local area in the hyperspectral image is scanned using a filter to extract key feature positions, and a KAN convolution operation is performed on the hyperspectral image at those key feature positions to obtain the first feature map. In order to reduce the computation load of the subsequent steps and to highlight salient features, the first feature map is binarized using a pulse activation function, which compresses the information volume while enhancing the attention to key features.
The KAN convolution kernel is responsible for the weighted summation of pixels in a specific region and their surrounding pixels to obtain local spatial-spectral features, extracting useful information from different bands by combining the spatial and spectral characteristics of the image. Through this operation, the extracted first feature map contains preliminary local feature information, laying a foundation for subsequent feature processing. Referring to example A in fig. 2, example A illustrates the binarization of the first feature map by the pulse activation function: the blue matrix is the input first feature map, and the white matrix after the last arrow is the output binarized first feature map.
As one embodiment, the hyperspectral image can be divided into a plurality of small regions (patches), and each patch is taken as an input unit for the preliminary extraction and output of spatial and spectral features, which reduces the computing resources occupied at any one time.
As one embodiment, after the spatial and spectral features in the first feature map are amplified using the adaptive attention mechanism, further noise reduction and feature highlighting may be performed in coordination with the adaptive attention mechanism. Specifically, the first feature map processed by the adaptive attention mechanism is binarized using a pulse activation function; this differs slightly from the binarization mentioned before in that it further suppresses noise and highlights key features, i.e., it further amplifies the spatial and spectral features on the basis of the amplification by the adaptive attention mechanism. Referring to example B in fig. 2, example B also illustrates binarization of the first feature map by the pulse activation function, but unlike example A, the blue matrix in example B is an input first feature map that has already been binarized once; in example B the first feature map is binarized again and output as the white matrix after the last arrow.
As one example, in the pulse activation function represented in figs. 1 and 2, 0.7 is used as the activation threshold θ.
And performing KAN convolution operation on the first feature map again, and recapturing the twice amplified features from the first feature map, so as to obtain the first feature map with prominent spatial and spectral features.
However, in the first feature map obtained after the KAN convolution operation, entrained noise may have been amplified synchronously; therefore, as one embodiment, a pulse activation function is used to perform a second binarization on the first feature map with highlighted spatial and spectral features, suppressing noise in the first feature map. The corrected first feature map may still have a high dimension, so, to ensure that the subsequent steps can proceed with lower computing resources, the dimension of the first feature map is reduced through a multi-scale pooling operation, which preserves global information (distinct from both the useful and the useless information discussed above) while reducing the computational burden. The dimension-reduced first feature map is obtained through the following formula:
$$P(x,y) = \max_{0 \le i,\, j < k} B(x \cdot k + i,\; y \cdot k + j)$$

In the formula, $P(x,y)$ is the pooled first feature map, i.e., the first feature map after dimension reduction, $x$ and $y$ are coordinates, $k$ is the pooling window size, $i$ and $j$ are the count sequence numbers of the pooled traversal elements, and $B$, the input feature map of the pooling operation, is the first feature map after the second binarization processing. The remaining multi-scale pooling operations in this embodiment that reduce feature-map dimensions may be processed with reference to this formula. Finally, a second feature map is obtained according to the dimension-reduced first feature map.
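The multi-scale aspect of the pooling mentioned here can be sketched by pooling the same map at several window sizes and concatenating the flattened results; the choice of max pooling, the scales, and the function names are illustrative assumptions.

```python
import numpy as np

def max_pool(fmap: np.ndarray, k: int) -> np.ndarray:
    """Non-overlapping max pooling with window size k (edges cropped)."""
    h, w = fmap.shape
    return fmap[:h // k * k, :w // k * k].reshape(h // k, k, w // k, k).max(axis=(1, 3))

def multi_scale_pool(fmap: np.ndarray, scales=(2, 4)) -> np.ndarray:
    """Pool the same map at several window sizes and concatenate the
    flattened results, retaining both fine and coarse (global) detail."""
    return np.concatenate([max_pool(fmap, k).ravel() for k in scales])
```

The coarsest scale approaches a global pooling of the map, which matches the stated goal of preserving global information while shrinking dimensions.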
For hyperspectral images of different information depths and complexities, the corresponding second feature map should undergo refocusing convolution a different number of times. As one embodiment, the refocusing convolution process is as follows:
Calculating a characteristic value after refocusing convolution by the following formula:
$$F(x,y) = \sum_{m}\sum_{n} W(m,n)\, X(x+m,\; y+n) + b$$

In the formula, $F(x,y)$ is the eigenvalue after the refocusing convolution, $W$ is the weight matrix of the refocusing convolution kernel, $X$ is the input second feature map, $x$ and $y$ are coordinates, $m$ and $n$ are the coordinates of the convolution kernel, and $b$ is an offset term used to adjust the offset of the output in the refocusing convolution operation.
Refocusing convolution aims to strengthen local features in the image by adaptively adjusting the response intensity of the convolution kernel. The second feature map is recombined from the eigenvalues after refocusing convolution; a learnable parameter is introduced to dynamically adjust each eigenvalue, and this channel-wise adjustment amplifies or attenuates the convolution-kernel output of specific areas so that the response of a specific channel becomes more prominent, thereby controlling the detail parts of the image. The dynamically adjusted eigenvalue is obtained through the following formula:
$$\tilde{F}_c = \alpha_c \cdot F_c$$

In the formula, $\tilde{F}_c$ is the dynamically adjusted eigenvalue of the $c$-th channel, $c$ is the channel index, $\alpha_c$ is the learnable parameter of the $c$-th channel, used to control the response intensity of the $c$-th channel, and $F_c$ is the eigenvalue of the $c$-th channel after the refocusing convolution;
The dynamically adjusted eigenvalues of the channels are then recombined to obtain a second feature map on which a single refocusing convolution has been completed.
Introducing a learnable parameter to dynamically adjust the eigenvalues is similar to weighting local features, and helps the model better identify important detail features in complex image scenes.
As one of the embodiments, after completing refocusing convolution each time, the second feature map is binarized using a pulse activation function to amplify the feature expression in the local area, ensuring that the locally important features are effectively amplified and expressed.
The second feature map with amplified feature expression still leaves room for optimization: multi-level features can be integrated and optimized through a higher-level attention mechanism and global feature optimization, and then further reduced in dimension. This process mainly comprises updating the feature-expression-amplified second feature map with the adaptive attention mechanism, which calculates the similarity between queries and keys based on the Hamming distance and optimizes the key features in the second feature map according to that similarity. The Hamming-distance-based attention mechanism aims to compute the interaction between high-level features (the key features), optimizing them and filtering out the useless information mixed among them.
The optimized second feature map can then be deeply convolved using refocusing convolution, in which the multi-level features undergo deep convolution processing and the deep features in the image are integrated and optimized.
The dimension of the second feature map after this processing is still relatively high, so the dimension of the refocusing-convolution-updated second feature map is reduced through a multi-scale pooling operation, and the dimension-reduced second feature map is obtained through the following formula:
$$P'(x,y) = \max_{0 \le i,\, j < k} F'(x \cdot k + i,\; y \cdot k + j)$$

In the formula, $P'(x,y)$ is the pooled second feature map, i.e., the second feature map after dimension reduction, $x$ and $y$ are coordinates, $k$ is the pooling window size, $i$ and $j$ are the count sequence numbers of the pooled traversal elements, and $F'$, the input of the pooling operation, is the second feature map updated by the refocusing convolution.
Feature dimensions are reduced through global pooling operation, and global key information of images is maintained.
In the first embodiment, the features in the third feature map are linearly combined and the linearly combined features are identified to obtain the classification result; in this embodiment, this includes the following.
The linearly combined features are obtained according to the following equation:
$$Z = W_f \cdot T + b_f$$

In the formula, $W_f$ is the weight matrix of the fully connected layer in the adaptive attention mechanism, $T$ is the feature map after the multi-scale pooling operation, i.e., the third feature map, $b_f$ is the bias term, and $Z$ is the linearly combined feature.
In addition, most convolution processes are unified into the same or similar KAN convolutions, which improves the reuse of the same module and reduces storage consumption and software optimization difficulty.
A number of KAN convolution operations are mentioned in this embodiment; referring to fig. 3, the KAN convolution operation formula used in this embodiment is:
$$Y(x,y,\lambda) = \sum_{m}\sum_{n}\sum_{s} \phi_{W(m,n,s)}\big(I(x+m,\; y+n,\; \lambda+s)\big)$$

In the formula, $x$ and $y$ are coordinates and $\lambda$ is the spectral dimension; $Y(x,y,\lambda)$ is the convolution result at coordinates $x$ and $y$ and spectral dimension $\lambda$, i.e., the first feature map after the KAN convolution operation; $W$ is the weight matrix of the KAN convolution kernel, parameterizing the learnable activation functions $\phi$; $m$ and $n$ are the coordinates of the convolution kernel; the summation over $m$, $n$ and $s$ represents the sliding operation of the KAN convolution kernel; $s$ is the spectral dimension of the convolution kernel; and $I(x+m, y+n, \lambda+s)$ is the pixel value at coordinates $(x+m, y+n)$ and spectral dimension $\lambda+s$;
The first convolution kernel of the KAN convolution operation uses a Gaussian radial basis function (GaussianRBF) kernel, which performs the convolution by measuring the distance of the input data points from the center of the convolution kernel and computing a weighted sum of those data. This kernel function is particularly suitable for the continuous spectral data characteristic of hyperspectral images and can effectively extract spectral and spatial features.
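A minimal sketch of how a Gaussian-RBF kernel could weight input values by their distance to per-tap centers inside a sliding window; the centers, widths, and single-channel setup are illustrative assumptions, not the KSANet implementation:

```python
import numpy as np

def gaussian_rbf(x, center, gamma=1.0):
    """Gaussian radial basis function: weight decays with distance to center."""
    return np.exp(-gamma * (x - center) ** 2)

def rbf_conv2d(X, centers, weights, k=3, gamma=1.0):
    """Valid 2-D convolution where each tap applies an RBF to the input
    value before the weighted sum, sketching a KAN-style per-position
    learnable activation."""
    h, w = X.shape
    out = np.zeros((h - k + 1, w - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = X[i:i + k, j:j + k]
            out[i, j] = np.sum(weights * gaussian_rbf(patch, centers, gamma))
    return out

X = np.random.default_rng(0).random((5, 5))
centers = np.zeros((3, 3))        # illustrative: one center per kernel tap
weights = np.ones((3, 3)) / 9.0
Y = rbf_conv2d(X, centers, weights)
```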
In the description of the present embodiment, the KAN convolution is preferably a fast KAN convolution.
By combining KAN convolution with the impulse neural network, fast KAN convolution improves the efficiency of the convolution operation, while the impulse activation function in the impulse neural network further optimizes feature extraction accuracy. The KAN convolution can efficiently extract spatial and spectral information in the hyperspectral image under limited computing resources, and the pulse activation function improves the method's attention to important features.
The present embodiment also uses the pulse activation function multiple times to binarize feature maps; for example, the first feature map is binarized by the pulse activation function. As one embodiment, the binarized feature map is obtained by the following formula:
$$S(i,j,k) = \mathbb{1}\left(X(i,j,k) > V_{th}\right)$$
In the formula, $S(i,j,k)$ is the pulse signal at coordinates $i$, $j$ and spectral dimension $k$, e.g. the first feature map after binarization processing; $\mathbb{1}(\cdot)$ is the indicator function, whose inequality argument is the execution condition; $V_{th}$ is the activation threshold, which determines whether a pulse signal is generated.
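The thresholded pulse activation is an elementwise comparison; a minimal sketch, with the threshold value chosen here purely for illustration:

```python
import numpy as np

def spike_binarize(X, v_th=0.5):
    """Pulse (spike) activation: emit 1 where the input exceeds the
    activation threshold v_th, else 0, yielding a binary feature map."""
    return (X > v_th).astype(np.uint8)

X = np.array([[0.2, 0.7],
              [0.9, 0.4]])
S = spike_binarize(X)
```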
Conventional attention mechanisms typically rely on continuous value similarity calculations, such as dot product or euclidean distance, to calculate the similarity between a query vector (Q) and a key vector (K). However, when applied in a pulsed neural network, this mechanism exposes the following problems:
First, continuous-value similarity is not applicable to pulse signals: the signals in a pulse neural network are binary 0 or 1, and conventional continuous-value similarity calculation cannot effectively process such discrete signals, reducing the accuracy of the similarity calculation.
Secondly, the calculation complexity is high, and when the pulse signal is processed, the traditional attention mechanism still uses complex continuous value calculation, and particularly under the condition of high dimension, a large amount of calculation resources are needed, so that the processing efficiency is seriously affected.
Finally, the feature expression capability is insufficient, and the traditional mechanism is difficult to capture the fine similarity among the features due to the discrete characteristics of the pulse signals, so that the classification performance is influenced, and the feature expression capability is poor particularly in the complex feature extraction task of hyperspectral images.
To overcome the above limitations, referring to fig. 4, the present embodiment proposes a binary Hamming kernel attention mechanism designed specifically for processing pulse signals. By calculating the Hamming distance between pulse signals, it can accurately measure the similarity between query vectors and key vectors and optimize the feature representation; in fig. 4, Softmax is shown as an exemplary activation function. Specifically, the Hamming kernel attention mechanism has the following advantages:
firstly, the similarity calculation of the pulse signals is adapted, and the Hamming distance calculation is based on the exclusive OR operation of binary signals and is more suitable for processing the pulse signals, unlike the traditional continuous value similarity calculation. By calculating the hamming distance between the pulse signals, the mechanism can effectively measure the discrete similarity between the query vector (Q) and the key vector (K), thereby enhancing the characteristic representation capability of the pulse neural network.
And secondly, the calculation complexity is low, the Hamming distance is calculated very efficiently, complex product and normalization operations in a continuous value attention mechanism are avoided, the calculation cost of a model is obviously reduced, and the model is excellent in processing a high-dimensional pulse signal.
And finally, strengthening key features, namely, through similarity calculation of binary Hamming distances, the attention mechanism can more accurately identify and strengthen the key features in the image, so that the classification performance of the model under high-dimensionality and complex data is greatly improved.
In addition, the innovative design of KAN convolution and self-adaptive convolution is combined, the embodiment effectively solves the complexity problem of hyperspectral images, and the classification accuracy and efficiency are further improved.
Thus, the present embodiment updates feature maps multiple times using the adaptive attention mechanism that calculates similarity based on the Hamming distance; for example, the first feature map is updated using this mechanism. As one embodiment, the general procedure of the adaptive attention mechanism in this embodiment includes,
Performing linear transformation on the first feature map to obtain a query vector, a key vector and a value vector;
similarity between the query vector and the key vector is calculated by hamming distance, and the similarity is obtained by the following formula:
$$D_H(Q_i, K_j) = \sum_{l=1}^{L} \left( Q_{i,l} \oplus K_{j,l} \right)$$
In the formula, $D_H(\cdot)$ is the Hamming distance calculation, used to measure the number of differing bits of two binary vectors (i.e. an exclusive-or operation between the elements at corresponding positions of the two vectors); $Q$ is the query vector and $K$ is the key vector; $i$ is the count index of the query vector and $j$ is the count index of the key vector; $Q_{i,l}$ is the $l$-th element of the $i$-th query vector and $K_{j,l}$ is the $l$-th element of the $j$-th key vector; $D_H(Q_i, K_j)$ represents the Hamming distance, i.e. the similarity, between query vector $Q_i$ and key vector $K_j$; $\oplus$ represents the exclusive-or operation of binary vectors, whose result is 1 when two elements differ and 0 when they are identical; $L$ is the element length of the vectors and $l$ is the count index of the elements;
based on the similarity between the query vector and the key vector, an attention weight is generated, which is obtained by the following calculation:
$$\alpha_{i,j} = \frac{\exp\left(-D_H(Q_i, K_j)\right)}{\sum_{j'} \exp\left(-D_H(Q_i, K_{j'})\right)}$$
In the formula, $j'$ is the count index for the similarity, and $\alpha_{i,j}$ is the attention weight of the $i$-th query vector and the $j$-th key vector. Notably, the negative sign in $\exp(-D_H)$ attenuates the Hamming distance: the smaller the distance, the greater the weight;
generating a new feature vector according to the attention weight and the value vector, obtaining a first feature map according to the new feature vector, and obtaining the feature vector through the following calculation:
$$Z_i = \sum_{j} \alpha_{i,j} V_j$$
In the formula, $Z_i$ is the updated feature vector, $j$ is the count index of the value vectors, and $V_j$ is the value vector corresponding to key vector $K_j$.
The remaining feature maps may be processed with reference to the above-described procedure.
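The procedure above (Hamming distance, softmax of the negated distance, weighted sum of value vectors) can be sketched as follows; the binary query/key matrices and small shapes are illustrative assumptions, not the source's configuration:

```python
import numpy as np

def hamming_attention(Q, K, V):
    """Binary Hamming-kernel attention sketch.

    Q, K are binary (0/1) matrices of shape (n, L); V is (n, d).
    Similarity is the negated Hamming distance; softmax of -distance
    turns smaller distances into larger attention weights, and the
    output mixes the value vectors by those weights.
    """
    # Pairwise Hamming distance via XOR: D[i, j] = sum_l Q[i, l] ^ K[j, l]
    D = (Q[:, None, :] ^ K[None, :, :]).sum(axis=2)
    # Softmax over -D: smaller distance -> larger weight.
    E = np.exp(-D.astype(float))
    A = E / E.sum(axis=1, keepdims=True)
    return A @ V  # updated feature vectors

Q = np.array([[1, 0, 1], [0, 0, 1]], dtype=np.uint8)
K = np.array([[1, 0, 1], [1, 1, 0]], dtype=np.uint8)
V = np.array([[1.0, 0.0], [0.0, 1.0]])
Z = hamming_attention(Q, K, V)
```

Because the distances are small integers, the softmax avoids the dot products and normalization of continuous-value attention, which is the efficiency argument made above.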
It is noted that, in this embodiment, the pulse neural network performs layer-by-layer feature extraction and classification in combination with multi-step pulse signals, and uses pooling layers to downsample the features, improving classification accuracy and efficiency. After the feature extraction layer extracts key features from the pulse signals, the pulse neural network combines the result of the attention mechanism to perform efficient classification.
To optimize feature similarity computation in hyperspectral images, this embodiment introduces a hamming distance attention mechanism based on binarized pulse signals. Unlike conventional continuous value similarity calculations, hamming distances can better handle pulse signals, generating attention weights by calculating binary similarity between query vectors and key vectors.
In this embodiment, local features are adaptively adjusted through the refocusing convolution, which dynamically adjusts the response intensity of the convolution kernel to different areas, ensuring effective extraction of complex detail features. Through this adaptive focusing mechanism, the model can focus on locally complex regions, improving the depth and accuracy of feature extraction.
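The text does not specify how the response intensity of a local area is measured; the sketch below uses local variance purely as an illustrative proxy for intensity, amplifying each block in proportion to it:

```python
import numpy as np

def refocus(X, k=2):
    """Illustrative refocusing step (not the author's exact design):
    measure each non-overlapping k x k block's response intensity by its
    variance, then amplify the block's features in proportion to it."""
    h, w = (X.shape[0] // k) * k, (X.shape[1] // k) * k
    out = X.astype(float).copy()
    for i in range(0, h, k):
        for j in range(0, w, k):
            patch = out[i:i + k, j:j + k]
            out[i:i + k, j:j + k] = patch * (1.0 + patch.var())
    return out

X = np.array([[1.0, 1.0, 0.0, 4.0],
              [1.0, 1.0, 2.0, 2.0]])
R = refocus(X)
```

Flat (low-variance) blocks pass through unchanged, while detail-rich blocks are amplified, matching the intent of adjusting feature expression by local response intensity.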
In this embodiment, the feature extraction capability for hyperspectral images is remarkably improved by introducing a pulse neural network (SNN) combined with the innovative design of fast Kolmogorov-Arnold (KAN) convolution and refocusing convolution. The impulse neural network simulates the impulse response characteristics of biological neurons, can efficiently process spatio-temporal data, and dynamically adjusts the model's response to complex features, ensuring high classification accuracy. Fast KAN convolution is responsible for efficient extraction of spatial and spectral features, while refocusing convolution strengthens the capture and optimization of local detail through an adaptive mechanism. Meanwhile, binarized pulse coding generates a high-fidelity pulse sequence by automatically coding each pixel, greatly improving the efficiency and precision of information transmission, and maintaining efficient learning and accurate classification particularly when training samples are scarce.
For experimental verification of this embodiment, Houston University 2013 was selected as the hyperspectral image dataset. Fig. 5 shows the real ground-object image of the dataset and fig. 6 shows its pseudo-color image; the image is 349×1905 pixels, contains a large number of background pixels, and has 15029 pixels containing ground objects, covering 15 ground-object classes including trees, asphalt roads, bricks, pastures and the like. 100 samples of each class are selected for training and the remainder for testing; the training samples account for nearly 10% of the total samples.
The hyperspectral image classification method provided by this embodiment is named KSANet. For comparison, SVM, 2D-CNN, 3D-CNN, SpectralFormer and SNN from the prior art are used as comparison items. The comparison covers the accuracy of the fifteen ground-object classes numbered 1 to 15, from which the overall accuracy, average accuracy and Kappa coefficient are further obtained, yielding the following accuracy comparison table for the different image classification methods.
Table 1 comparison table of accuracy of different image classification methods
As can be seen from Table 1, the overall accuracy of KSANet provided by this embodiment reaches 98.96%, which is significantly higher than the other methods, such as 93.19% for SVM and 97.23% for 2D-CNN. This indicates that KSANet has higher accuracy in processing hyperspectral image classification tasks. In particular, KSANet shows a large improvement over 3D-CNN (91.82% vs 98.96%), demonstrating the advantage of KSANet in extracting spatio-spectral features. In multiple categories KSANet achieves 100% classification accuracy, for example in class 5 (Bitumen), class 6 (Trees) and class 15 (Running Track). This remarkable classification performance is particularly evident in complex and blurry terrain, showing the high sensitivity of KSANet to detail features. In contrast, 3D-CNN performs significantly below KSANet in multiple categories, such as class 2 (90.83%) and class 8 (79.94%), showing the limitations of conventional convolutional neural networks on high-dimensional data and complex features.
From this experiment it can be concluded that the classification accuracy of the hyperspectral image classification method provided by this embodiment is obviously higher than that of the comparison methods, the classification efficiency is improved on the basis of the improved accuracy, and the complex spatial relationships and texture information in hyperspectral images are well utilized to perform efficient and highly accurate image recognition.
In order to visualize the classification results, fig. 7 and 8 are respectively classification result diagrams shown by SVM and 3D-CNN, fig. 9 shows a KSANet classification result diagram, and the classification results shown in fig. 7, 8 and 9 can clearly verify the conclusion of the experiment with table 1.
This embodiment provides an effective hyperspectral image classification method by introducing a pulse neural network, a binary Hamming kernel attention mechanism, a refocusing convolution, and a fast KAN convolution, and can remarkably improve classification precision and efficiency under conditions of complex data and few samples. The innovative method performs excellently in remote sensing image analysis and can be applied to various practical scenarios such as precision agriculture and environmental monitoring.
Embodiment III:
The present embodiment provides a computer device comprising a processor and a memory connected to the processor, wherein a computer program is stored in the memory, and when the computer program is executed by the processor, the steps of the hyperspectral image classification method as provided in the first or second embodiment are performed.
The computer device may be a server or an electronic terminal, and as one embodiment, referring to fig. 10, the computer device includes a processor, a memory, an Input/Output interface (I/O) and a communication interface. The processor, the memory and the input/output interface are connected through a system bus, and the communication interface is connected to the system bus through the input/output interface. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is used for storing data acquired and generated in the hyperspectral image classification method. The input/output interface of the computer device is used to exchange information between the processor and the external device. The communication interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement the hyperspectral image classification method provided in the first or second embodiment.
It will be appreciated by those skilled in the art that the structure shown in FIG. 10 is merely a block diagram of some of the structures associated with the present inventive arrangements and is not limiting of the computer device to which the present inventive arrangements may be applied, and that a particular computer device may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.
The computer device provided in this embodiment has the same technical effects as those of the first or second embodiment, and will not be described herein.
Embodiment four:
the present embodiment provides a computer program product having a computer program stored thereon, which when executed by a processor, implements the steps of the hyperspectral image classification method provided in the first or second embodiments. The computer program product provided in this embodiment may be transmitted, distributed, and downloaded in the form of signals through the internet.
The computer program product provided in this embodiment has the same technical effects as those in the first or second embodiment, and will not be described herein.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In the description of the present invention, unless otherwise indicated, the meaning of "a plurality" is two or more.
The foregoing is merely a preferred embodiment of the present invention, and it should be noted that modifications and variations could be made by those skilled in the art without departing from the technical principles of the present invention, and such modifications and variations should also be regarded as being within the scope of the invention.