Disclosure of Invention
In view of the above problems and technical requirements, the application provides a knob switch state identification method based on a twin neural network. The technical scheme of the application is as follows:
a knob switch state identification method based on a twin neural network comprises the following steps:
acquiring an original preset position image, wherein the original preset position image comprises a knob image of the area where the knob switch in the state to be identified is located and a background image;
performing image preprocessing on the original preset position image, and extracting the image-corrected knob image from the original preset position image as the image to be identified, wherein the image to be identified is the image-corrected front-view image of the knob switch in the state to be identified;
traversing all knob switch standard images in a knob switch standard library in sequence, and inputting each traversed knob switch standard image together with the image to be identified into a pre-trained similarity calculation model to obtain the image similarity between that knob switch standard image and the image to be identified; the knob switch standard library comprises knob switch standard images of various types of knob switches in various gear states, and each knob switch standard image is a front-view image of the area where the knob switch is located, acquired from a shooting angle directly facing the knob switch; the similarity calculation model is constructed and trained in advance on the basis of a twin neural network;
and obtaining the gear state of the knob switch in the state to be identified according to the image similarity between the image to be identified and the knob switch standard images in different gear states.
The further technical scheme is that the knob switch state identification method further comprises the following steps:
shooting original sample images of various types of knob switches in various gear states from various different shooting angles respectively, extracting an image of an area where the knob switch is located from each original sample image, marking the gear state of the corresponding knob switch as a training sample, and constructing a training data set;
constructing a network framework of the similarity calculation model, the network framework comprising a twin neural network, a fully connected layer and a Softmax layer, wherein the twin neural network comprises two feature extraction modules which are connected together and share network parameters, and each feature extraction module takes ResNet50 as its basic network structure and introduces a CBAM module to learn an attention mechanism;
and performing model training on the network framework of the similarity calculation model by using the training data set to obtain the trained similarity calculation model, wherein the two feature extraction modules in the similarity calculation model each independently receive one input image and output a feature map, the twin neural network connects the feature maps output by the two feature extraction modules together to form the final feature representation, and the Euclidean distance is calculated after the final feature representation passes through the fully connected layer and the Softmax layer in sequence to obtain the image similarity between the two images.
The further technical scheme is that obtaining the gear state of the knob switch in the state to be identified comprises:
after the image similarity between each knob switch standard image and the image to be identified is obtained, calculating, for each gear state, the average of the image similarities between the image to be identified and all knob switch standard images having that gear state, and taking the average as the gear similarity corresponding to that gear state;
and taking the gear state with the highest corresponding gear similarity as the gear state of the knob switch in the state to be identified.
The further technical scheme is that extracting the image-corrected knob image from the original preset position image as the image to be identified comprises:
determining the preset position image template corresponding to the preset shooting position of the original preset position image, wherein different preset shooting positions have different preset position image templates, and the preset position image template corresponding to each preset shooting position indicates the area where the knob switch is located in an image shot at that preset shooting position;
and performing image preprocessing on the original preset position image by using the determined preset position image template, and extracting the image to be identified.
The further technical scheme is that performing image preprocessing on the original preset position image by using the determined preset position image template comprises:
performing feature extraction on the preset position image template and on the original preset position image respectively by using a convolutional neural network to obtain their respective feature maps;
performing feature matching on the two feature maps to align the original preset position image with the preset position image template, and extracting, in combination with the area where the knob switch is located in the preset position image template, the knob image of the area where the knob switch in the state to be identified is located in the original preset position image;
and performing image correction on the extracted knob image to obtain the image to be identified.
The further technical scheme is that extracting the knob image of the area where the knob switch in the state to be identified is located in the original preset position image comprises:
detecting image feature points of the preset position image template based on the feature map of the preset position image template, and detecting image feature points of the original preset position image based on the feature map of the original preset position image;
filtering the image feature points of the preset position image template and the image feature points of the original preset position image by using a random sample consensus (RANSAC) algorithm, and screening out the four best-matched pairs of image feature points, wherein each pair comprises one image feature point in the preset position image template and the matching image feature point in the original preset position image;
solving, from the coordinates of the four pairs of image feature points, the transformation matrix $H$ according to $\begin{bmatrix} x_i \\ y_i \\ 1 \end{bmatrix} = H \begin{bmatrix} x_i' \\ y_i' \\ 1 \end{bmatrix}$, $i = 1, 2, 3, 4$, wherein $(x_1, y_1)$ and $(x_1', y_1')$ are the coordinates of one pair of image feature points, $(x_2, y_2)$ and $(x_2', y_2')$ are the coordinates of one pair of image feature points, $(x_3, y_3)$ and $(x_3', y_3')$ are the coordinates of one pair of image feature points, $(x_4, y_4)$ and $(x_4', y_4')$ are the coordinates of one pair of image feature points, $(x_1, y_1)$, $(x_2, y_2)$, $(x_3, y_3)$ and $(x_4, y_4)$ are all image feature points in the original preset position image, and $(x_1', y_1')$, $(x_2', y_2')$, $(x_3', y_3')$ and $(x_4', y_4')$ are all image feature points in the preset position image template;
using the transformation matrix $H$ to perform coordinate transformation on the position coordinates of the area where the knob switch is located in the preset position image template to obtain the position coordinates of the area where the knob switch in the state to be identified is located in the original preset position image, and extracting the knob image.
The further technical scheme is that the image correction of the extracted knob image comprises the following steps:
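determining the coordinates of the four vertices of the knob image in the original preset position image as $A(x_A, y_A)$, $B(x_B, y_B)$, $C(x_C, y_C)$ and $D(x_D, y_D)$;
calculating, from the coordinates of the four vertices, the Euclidean distance $d_{AB}$ between vertex $A$ and vertex $B$, the Euclidean distance $d_{BC}$ between vertex $B$ and vertex $C$, the Euclidean distance $d_{CD}$ between vertex $C$ and vertex $D$, and the Euclidean distance $d_{DA}$ between vertex $D$ and vertex $A$;
determining the coordinates of the four vertices of the transformed image as $A'(0, 0)$, $B'(w, 0)$, $C'(w, h)$ and $D'(0, h)$, wherein $w$ is the maximum of $d_{AB}$ and $d_{CD}$, and $h$ is the maximum of $d_{BC}$ and $d_{DA}$;
solving the transformation matrix $M$ according to $\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = M \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$, where $(x, y)$ is each vertex of the knob image and $(x', y')$ is the corresponding vertex of the transformed image, and using the transformation matrix $M$ to perform image transformation on the knob image in the original preset position image to obtain the image to be identified.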
The further technical scheme is that using the transformation matrix $M$ to perform image transformation on the knob image in the original preset position image to obtain the image to be identified comprises:
using the transformation matrix $M$ to perform image transformation on the knob image in the original preset position image to obtain an output image, wherein the output image is a front-view image of the knob switch in the state to be identified;
calculating the gray-level histogram of the pixels in the output image, and calculating the cumulative distribution function of each gray level in the gray-level histogram, wherein the cumulative distribution function $\mathrm{cdf}(k)$ of any gray level $k$ equals the total number of pixels whose gray level lies in the range $[k_{\min}, k]$, $k_{\min} \le k \le k_{\max}$, $k_{\min}$ is the minimum gray level, and $k_{\max}$ is the maximum gray level;
converting the gray level of every pixel whose gray level is $k$ in the output image to $h(k)$ by a mapping computed from the cumulative distribution function, wherein $N$ is the total number of pixels contained in the output image and $\mathrm{round}(\cdot)$ denotes the rounding operation used in the mapping.
The further technical scheme is that detecting the image feature points of the preset position image template and detecting the image feature points of the original preset position image comprises, for either one of the preset position image template and the original preset position image taken as the input image:
extracting key pixel points and the descriptor of each key pixel point from the feature map of the input image, wherein each key pixel point has the maximum spatial feature value within a preset local neighborhood range and the maximum channel feature value in the channel direction; the descriptor of each key pixel point is the channel-direction vector at the position of that key pixel point;
and restoring the key pixel points to the image size of the input image through bilinear interpolation according to the extracted key pixel points and their descriptors, thereby extracting the image feature points of the input image.
The further technical scheme is that performing feature extraction on the preset position image template and on the original preset position image respectively by using a convolutional neural network to obtain their respective feature maps comprises, for either one of the preset position image template and the original preset position image taken as the input image:
downsampling the input image twice by a combination of convolution and max pooling to obtain a deep feature map of the input image, wherein the image size of the deep feature map is 1/4 of the image size of the input image;
and performing convolution, average pooling and dilated convolution on the deep feature map, and fusing the results to extract the feature map of the input image.
The beneficial technical effects of this application are:
The method uses a similarity calculation model based on a twin neural network to learn and express the state features of the knob switch and the differences between gear states. The image preprocessing of the original preset position image effectively reduces the influence of complex backgrounds, illumination and variable shooting angles. After the image to be identified is extracted, state identification is performed by determining the image similarity between the knob switch standard images and the image to be identified. The method has good identification accuracy and robustness, is well suited to fields such as automatic control and intelligent substations, and helps improve the efficiency and safety of equipment monitoring and management.
When the twin neural network of the similarity calculation model is constructed, ResNet50 is taken as the basic network structure and a CBAM attention mechanism is added, so that the deep convolutional neural network can concentrate on extracting knob features and achieves a better feature extraction effect.
Detailed Description
The following describes the embodiments of the present application further with reference to the accompanying drawings.
The application discloses a knob switch state identification method based on a twin neural network. Referring to the flow chart shown in Fig. 1, the knob switch state identification method comprises the following steps:
Step 1, acquiring an original preset position image.
In the application scenario of a substation, a plurality of cameras are fixed at different positions in the substation, and each camera can rotate to several different target angles to shoot images within its field of view. The shooting position of a camera at one target angle is a preset shooting position, and the image acquired by a camera at one preset shooting position is an original preset position image in this application. Since the fixed position of each camera and the target angles to which each camera can rotate are all predetermined, all preset shooting positions in the whole substation can be determined. Each original preset position image acquired in this application is shot at one of the preset shooting positions in the substation, and the specific preset shooting position at which it was shot can be determined.
Because the substation environment is complex, the original preset position image shot at each preset shooting position often contains other equipment in the substation besides the knob switch, so each acquired original preset position image contains a background image in addition to the knob image of the area where the knob switch in the state to be identified is located.
Step 2, performing image preprocessing on the original preset position image, and extracting the image-corrected knob image from the original preset position image as the image to be identified.
Because a substation contains many types and a large number of knob switches, it is generally impossible to arrange one camera directly facing each knob switch. The acquired original preset position image is therefore often not shot from directly in front of the knob switch in the state to be identified, but from one of various non-uniform inclination angles, and it is also affected by other environmental factors.
Therefore, after the original preset position image is acquired, image preprocessing is performed first to compensate for the interference caused by the shooting angle and the shooting environment, and the extracted image to be identified is the image-corrected front-view image of the knob switch in the state to be identified.
Step 3, traversing all knob switch standard images in the knob switch standard library in sequence, and inputting each traversed knob switch standard image together with the image to be identified into the pre-trained similarity calculation model to obtain the image similarity between that knob switch standard image and the image to be identified.
The knob switch standard library is constructed in advance. It comprises knob switch standard images of various types of knob switches in various gear states, and each knob switch standard image is a front-view image of the area where the knob switch is located, acquired from a shooting angle directly facing the knob switch. Each knob switch has several different gear states according to its gear rotation angle, and each gear state covers one range of gear rotation angles of the knob switch. The gear states are divided in advance according to actual conditions, for example one gear state for every 30° of gear rotation angle starting from 0°, and the division can be customized in practice. When the knob switch standard library is constructed, a knob switch of each type in the substation is adjusted to each of its gear states in turn, and in each gear state a camera directly facing the knob switch shoots a knob switch standard image. After the knob switch standard images of one knob switch in all gear states are obtained, the knob switches of the other types are handled in the same way, so that the knob switch standard images of all types of knob switches in the substation in all gear states are obtained.
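The example division of gear states by 30° ranges can be illustrated with a minimal sketch. The function name `gear_state_from_angle`, the zero-based gear index and the wrap-around at 360° are assumptions made for illustration only; the application leaves the actual division to be defined in practice.

```python
def gear_state_from_angle(angle_deg: float, step_deg: float = 30.0) -> int:
    """Return the index of the gear state whose angle range contains angle_deg.

    Assumed convention: gear state 0 covers [0, step), gear state 1 covers
    [step, 2 * step), and so on, with angles wrapped into [0, 360).
    """
    angle = angle_deg % 360.0      # wrap the rotation angle into [0, 360)
    return int(angle // step_deg)  # index of the range that contains it
```

With the default 30° step, angles from 0° up to but not including 30° fall into gear state 0, and 95° falls into gear state 3.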
The step also needs to use a similarity calculation model which is constructed and trained in advance based on the twin neural network, and a training method thereof is described later.
Step 4, obtaining the gear state of the knob switch in the state to be identified according to the image similarity between the image to be identified and the knob switch standard images in different gear states.
In one embodiment, the gear state of the knob switch standard image with the highest image similarity with the image to be identified is used as the gear state of the knob switch of the state to be identified.
However, in order to improve the recognition accuracy, in another embodiment, after the image similarity between each of the standard images of the rotary switch and the image to be recognized is obtained, an average value of the image similarity between each of the standard images of the rotary switch and the image to be recognized, which has the same gear state, is calculated as the gear similarity corresponding to the gear state. After the gear similarity of each gear state is obtained, the gear state with the highest corresponding gear similarity is used as the gear state of the knob switch in the state to be identified.
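The averaging embodiment can be sketched as follows. This is a minimal illustration, assuming the similarities have already been computed and are supplied as (gear state, image similarity) pairs; the function name `recognize_gear_state` and the data layout are hypothetical.

```python
from collections import defaultdict


def recognize_gear_state(similarities):
    """Pick the gear state with the highest average image similarity.

    similarities: iterable of (gear_state, image_similarity) pairs, one pair
    per knob switch standard image compared with the image to be identified.
    Returns (best_gear_state, gear_similarity_by_state).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for gear, sim in similarities:
        sums[gear] += sim
        counts[gear] += 1
    # Gear similarity = mean image similarity over standard images of that gear.
    gear_similarity = {gear: sums[gear] / counts[gear] for gear in sums}
    best = max(gear_similarity, key=gear_similarity.get)
    return best, gear_similarity
```

Averaging can change the decision compared with taking the single most similar standard image: with pairs `("on", 0.9)`, `("on", 0.5)`, `("off", 0.8)`, `("off", 0.7)`, the single best match is an "on" image, while the averaged gear similarity favors "off".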
Since step 3 requires a similarity calculation model, the method further includes pre-training the similarity calculation model, which comprises the following steps (refer to the flow chart shown in Fig. 2):
1. Original sample images of the knob switches of various types in various gear states are shot from a plurality of different shooting angles, the image of the area where the knob switch is located is extracted from each original sample image and labeled with the gear state of the corresponding knob switch to serve as a training sample, and a training data set is constructed.
The method for constructing the training data set is similar to the method for constructing the knob switch standard library, and in practical application, the constructed training data set contains all knob switch standard images in the knob switch standard library, so that the two parts share one process, and the knob switch standard library can be constructed while the training data set is constructed.
The difference from constructing the knob switch standard library is that, for each type of knob switch in the substation adjusted to each gear state, the knob switch is shot by the camera not only from directly in front but also from a plurality of different shooting angles.
A spatial three-dimensional virtual coordinate system is established at the knob switch. The xOy plane of the coordinate system is parallel to the horizontal plane, the z-axis points vertically upward from the horizontal plane, the y-axis points toward the front of the knob switch, and the x-axis is perpendicular to the y-axis. The plurality of different shooting angles include different horizontal shooting angles relative to the knob switch along the x-axis direction, different vertical shooting angles relative to the knob switch along the z-axis direction, and different front-to-back distances from the knob switch along the y-axis direction. In one embodiment, for each type of knob switch in each gear state, the dome camera is moved stepwise along the x-axis direction so that its included angle with the positive x-axis direction ranges from 30° to 150°, in steps of 30°, and shoots the knob switch at each step. Likewise, the dome camera is moved stepwise along the z-axis direction within the range of 30° to 150° of included angle with the positive z-axis direction, in steps of 30°, and shoots the knob switch at each step. The dome camera is also moved stepwise along the y-axis direction within a front-to-back distance of 0.5 m to 2 m from the knob switch, with a step length of 0.5 m, and shoots the knob switch at each step.
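The sampling ranges of this embodiment can be enumerated with a short sketch. Combining the three sweeps into a full grid is an assumption made here for illustration; the embodiment only states the range and step of each sweep. The function name `shooting_poses` is hypothetical.

```python
from itertools import product


def shooting_poses():
    """Enumerate (horizontal_angle_deg, vertical_angle_deg, distance_m) tuples.

    Assumption: every horizontal angle is combined with every vertical angle
    and every distance; the text only gives each range and step separately.
    """
    horizontal = range(30, 151, 30)             # 30, 60, 90, 120, 150 degrees
    vertical = range(30, 151, 30)               # 30, 60, 90, 120, 150 degrees
    distances = [0.5 * i for i in range(1, 5)]  # 0.5, 1.0, 1.5, 2.0 metres
    return list(product(horizontal, vertical, distances))
```

Under the full-grid assumption this gives 5 × 5 × 4 shooting poses per knob switch type and gear state.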
After each original sample image is obtained through shooting, an image of the area where the knob switch is located is extracted from the original sample image, a background image is removed, and then a corresponding gear state is marked.
2. A network framework of the similarity calculation model is constructed. The framework comprises a twin neural network, a fully connected layer and a Softmax layer. The twin neural network comprises two feature extraction modules which are connected together and share network parameters; sharing the parameters increases network efficiency and reduces the number of parameters. Each feature extraction module takes ResNet50 as its basic network structure and introduces a CBAM module to learn an attention mechanism.
3. Model training is performed on the network framework of the similarity calculation model by using the training data set to obtain the trained similarity calculation model.
The two feature extraction modules in the similarity calculation model each independently receive one input image and output a feature map. The twin neural network connects the feature maps output by the two feature extraction modules together to form the final feature representation, and the Euclidean distance is calculated after the final feature representation passes through the fully connected layer and the Softmax layer in sequence to obtain the image similarity between the two images.
The similarity calculation model is subjected to cross matching training by using training samples of different types of knob switches in the training data set under different gear states and different shooting angles, so that the similarity calculation model has stronger feature extraction capability and robustness.
After the similarity calculation model receives a knob switch standard image and the image to be identified in step 3, the two images are first standardized so that they have the same image shape and image size. The two images are then fed into the feature extraction modules for forward propagation and pass through the shared layers and the independent layers of the twin neural network respectively to obtain their respective feature maps; the features at this stage are high-level representations of the images that describe their features and structures. Finally, the feature maps of the two images are connected together to form the final feature representation of the twin neural network, and the Euclidean distance is calculated after the fully connected layer, the Softmax layer and other subsequent processing modules to obtain the image similarity between the knob switch standard image and the image to be identified.
In one embodiment, when image preprocessing is performed on the original preset position image in step 2, the preset position image template corresponding to the preset shooting position of the original preset position image is determined first, and the determined preset position image template is then used to perform image preprocessing on the original preset position image and extract the image to be identified. Different preset shooting positions have different preset position image templates. The preset position image template corresponding to each preset shooting position indicates the area where the knob switch is located in an image shot at that preset shooting position, and is determined in advance.
Performing image preprocessing on the original preset position image by using the determined preset position image template comprises the following steps (refer to the flow chart shown in Fig. 3):
1. Feature extraction is performed on the preset position image template and on the original preset position image respectively by using a convolutional neural network to obtain their respective feature maps.
In order to capture the key information in an image, for either one of the preset position image template and the original preset position image taken as the input image, extracting the feature map of the input image includes: (1) The input image is downsampled twice by a combination of convolution and max pooling to obtain a deep feature map of the input image, whose image size is 1/4 of the image size of the input image. This prevents excessive loss of spatial features during convolutional feature extraction. (2) Convolution, average pooling and dilated convolution are then performed on the deep feature map, so that sparse features are fused and extracted and a larger receptive field is obtained. The feature map of the input image extracted in this way captures more global image information and more discriminative features.
2. Feature matching is performed on the two feature maps to align the original preset position image with the preset position image template, and the knob image of the area where the knob switch in the state to be identified is located in the original preset position image is extracted in combination with the area where the knob switch is located in the preset position image template. This comprises the following:
(1) Image feature points of the preset position image template are detected based on the feature map of the preset position image template, and image feature points of the original preset position image are detected based on the feature map of the original preset position image.
In one embodiment, for either one of the preset position image template and the original preset position image taken as the input image, detecting the image feature points of the input image based on its feature map includes:
Key pixel points and the descriptor of each key pixel point are extracted from the feature map of the input image, wherein each key pixel point has the maximum spatial feature value within a preset local neighborhood range and the maximum channel feature value in the channel direction. The descriptor of each key pixel point is the channel-direction vector at the position of that key pixel point. Such descriptors carry richer and more comprehensive information and, compared with traditional feature point extraction algorithms, better resist interference such as changes in scale, rotation, illumination and viewing angle, and non-rigid transformation.
The extracted key pixel points can be expressed as the pixels of the output feature map $F$ that belong to two sets at once. The first set contains every pixel $(i, j)$ of the feature map whose spatial feature value is the maximum within its preset local neighborhood range; the neighborhood range can be customized, for example a 3×3 neighborhood, and is taken over the pixels of $F$ at the neighboring locations. The second set contains every pixel whose channel feature value at feature channel index $k$ is the maximum along the channel direction, taken over the pixels of $F$ in the channel direction at that location. In this notation, the symbol ":" indicates that the corresponding parameter is left at its default.
And then, according to the extracted key pixel points and descriptors of the key pixel points, restoring the key pixel points to the image size of the input image through bilinear interpolation, and extracting the image feature points of the input image.
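The key pixel point criterion can be illustrated with a minimal NumPy sketch. It assumes a feature map stored as an array of shape (height, width, channels) and a square neighborhood of radius 1 (a 3×3 window); the function name `detect_keypoints` is hypothetical, ties are counted as maxima, and the bilinear restoration to the input image size is omitted.

```python
import numpy as np


def detect_keypoints(feature_map, radius=1):
    """Return key pixel points and their channel-direction descriptors.

    A pixel (i, j) is kept when, in its strongest channel k (the maximum
    along the channel direction), its value is also the maximum within the
    (2 * radius + 1) x (2 * radius + 1) spatial neighborhood.
    """
    h, w, _ = feature_map.shape
    best_channel = feature_map.argmax(axis=2)  # strongest channel per pixel
    keypoints, descriptors = [], []
    for i in range(h):
        for j in range(w):
            k = best_channel[i, j]
            i0, i1 = max(i - radius, 0), min(i + radius + 1, h)
            j0, j1 = max(j - radius, 0), min(j + radius + 1, w)
            if feature_map[i, j, k] >= feature_map[i0:i1, j0:j1, k].max():
                keypoints.append((i, j))
                # Descriptor: the channel-direction vector at this position.
                descriptors.append(feature_map[i, j, :].copy())
    return keypoints, np.array(descriptors)
```

The double loop is written for clarity rather than speed; a practical implementation would vectorize the neighborhood maximum.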
(2) The image feature points of the preset position image template and the image feature points of the original preset position image are filtered by using a random sample consensus (RANSAC) algorithm, and the four best-matched pairs of image feature points are screened out, wherein each pair comprises one image feature point in the preset position image template and the matching image feature point in the original preset position image.
(3) From the coordinates of the four pairs of image feature points, the transformation matrix $H$ is solved according to $\begin{bmatrix} x_i \\ y_i \\ 1 \end{bmatrix} = H \begin{bmatrix} x_i' \\ y_i' \\ 1 \end{bmatrix}$, $i = 1, 2, 3, 4$. The transformation matrix $H$ is a 3×3 matrix.
Here, $(x_1, y_1)$ and $(x_1', y_1')$ are the coordinates of one pair of image feature points, $(x_2, y_2)$ and $(x_2', y_2')$ are the coordinates of one pair of image feature points, $(x_3, y_3)$ and $(x_3', y_3')$ are the coordinates of one pair of image feature points, and $(x_4, y_4)$ and $(x_4', y_4')$ are the coordinates of one pair of image feature points. $(x_1, y_1)$, $(x_2, y_2)$, $(x_3, y_3)$ and $(x_4, y_4)$ are all image feature points in the original preset position image, and $(x_1', y_1')$, $(x_2', y_2')$, $(x_3', y_3')$ and $(x_4', y_4')$ are all image feature points in the preset position image template.
(4) The transformation matrix $H$ is used to perform coordinate transformation on the position coordinates of the area where the knob switch is located in the preset position image template, which completes the image alignment of the original preset position image with the preset position image template.
After the original preset position image is aligned with the preset position image template, because the area where the knob switch is located in the preset position image template is known, the position coordinates of the area where the knob switch in the state to be identified is located in the original preset position image can be obtained correspondingly and the knob image can be extracted.
A common practice is as follows: the coordinates of the four vertices of the rectangular area where the knob switch is located in the preset position image template are known, and the transformation matrix $H$ is used to perform coordinate transformation on these four vertex coordinates to obtain the coordinates of the four vertices of the area where the knob switch in the state to be identified is located in the original preset position image. The four vertices are then connected in sequence to determine that area, and the image within the area is cropped to obtain the knob image.
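Solving a 3×3 transformation matrix from four point pairs and applying it to the template vertices can be sketched with NumPy. This is a minimal illustration assuming the matrix maps template coordinates to original-image coordinates and that its bottom-right element can be normalized to 1; the function names `solve_homography` and `transform_points` and the sample coordinates are hypothetical.

```python
import numpy as np


def solve_homography(template_pts, image_pts):
    """Solve the 3x3 matrix H mapping each template point to its image point.

    Uses exactly four point pairs and fixes H[2, 2] = 1, which leaves eight
    unknowns and eight linear equations.
    """
    rows, rhs = [], []
    for (xt, yt), (xi, yi) in zip(template_pts, image_pts):
        rows.append([xt, yt, 1, 0, 0, 0, -xi * xt, -xi * yt])
        rows.append([0, 0, 0, xt, yt, 1, -yi * xt, -yi * yt])
        rhs.extend([xi, yi])
    h = np.linalg.solve(np.array(rows, dtype=float), np.array(rhs, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)


def transform_points(H, pts):
    """Apply H to 2D points in homogeneous coordinates and dehomogenize."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return homog[:, :2] / homog[:, 2:3]


# Hypothetical matched pairs: the image is the template scaled by 2 and shifted.
template_pairs = [(0, 0), (100, 0), (100, 100), (0, 100)]
image_pairs = [(10, 20), (210, 20), (210, 220), (10, 220)]
H = solve_homography(template_pairs, image_pairs)

# Map the known rectangular knob area of the template into the original image.
knob_rect_in_template = [(40, 40), (60, 40), (60, 60), (40, 60)]
knob_quad_in_image = transform_points(H, knob_rect_in_template)
```

In practice the four pairs would come from the RANSAC screening step, and a library routine such as OpenCV's `cv2.getPerspectiveTransform` could replace the hand-written solver.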
3. And carrying out image correction on the knob image obtained by extraction, and extracting to obtain an image to be identified.
When the original preset position image is shot from directly in front of the knob switch in the state to be identified, the boundary of the extracted knob image forms a rectangle. Because of the shooting angle, however, the boundary of the extracted knob image is often not a rectangle but an irregular quadrilateral. Image correction therefore first converts the irregular boundary of the knob image into a rectangle, so that the knob image is restored as far as possible to its form under front-view shooting and the influence of the shooting angle on identification is reduced. This comprises the following:
(1) The coordinates of the four vertices of the knob image in the original preset position image are determined as $A(x_A, y_A)$, $B(x_B, y_B)$, $C(x_C, y_C)$ and $D(x_D, y_D)$. As shown in Fig. 4, the quadrilateral formed by the four vertices of the knob image is generally irregular. From the coordinates of the four vertices, the Euclidean distance $d_{AB}$ between vertex $A$ and vertex $B$, the Euclidean distance $d_{BC}$ between vertex $B$ and vertex $C$, the Euclidean distance $d_{CD}$ between vertex $C$ and vertex $D$, and the Euclidean distance $d_{DA}$ between vertex $D$ and vertex $A$ are calculated.
(2) According to the Euclidean distances between adjacent vertices of the knob image, the coordinates of the four vertices of the transformed image are determined as $A'(0, 0)$, $B'(w, 0)$, $C'(w, h)$ and $D'(0, h)$, wherein $w$ is the maximum of $d_{AB}$ and $d_{CD}$, and $h$ is the maximum of $d_{BC}$ and $d_{DA}$.
(3) The transformation matrix $M$ is solved according to $\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = M \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$, where $(x, y)$ is each vertex of the knob image and $(x', y')$ is the corresponding vertex of the transformed image. The transformation matrix $M$ is a 3×3 matrix, and the value range of each element of $M$ is [1, 3].
(4) The transformation matrix $M$ is used to perform image transformation on the knob image in the original preset position image to obtain the image to be identified. This step can be completed with the OpenCV library function warpPerspective().
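The correction steps can be sketched with NumPy as follows. The sketch assumes the four vertices are ordered top-left, top-right, bottom-right, bottom-left and that the target rectangle has its top-left corner at the origin; both conventions, the sample coordinates, and the function names `rectification_targets` and `solve_perspective` are assumptions for illustration.

```python
import numpy as np


def rectification_targets(vertices):
    """Return the four target vertices of the rectified rectangle.

    Width is the longer of the two horizontal sides, height the longer of
    the two vertical sides (assumed order: TL, TR, BR, BL).
    """
    a, b, c, d = (np.asarray(p, dtype=float) for p in vertices)
    w = max(np.linalg.norm(a - b), np.linalg.norm(c - d))
    h = max(np.linalg.norm(b - c), np.linalg.norm(d - a))
    return [(0.0, 0.0), (w, 0.0), (w, h), (0.0, h)]


def solve_perspective(src_pts, dst_pts):
    """Solve the 3x3 matrix M (M[2, 2] fixed to 1) mapping src to dst points."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src_pts, dst_pts):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.extend([u, v])
    m = np.linalg.solve(np.array(rows, dtype=float), np.array(rhs, dtype=float))
    return np.append(m, 1.0).reshape(3, 3)


# Hypothetical irregular quadrilateral of a knob image shot at an angle.
quad = [(10, 10), (110, 20), (100, 80), (0, 60)]
targets = rectification_targets(quad)
M = solve_perspective(quad, targets)
```

With OpenCV available, the matrix could instead be obtained from `cv2.getPerspectiveTransform` and applied with `cv2.warpPerspective(image, M, (width, height))`, where `width` and `height` are the rounded target dimensions.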
The output image obtained in step (4) by transforming the knob image in the original preset position image with the transformation matrix $M$ is a front-view image of the knob switch in the state to be identified; that is, the original knob image is restored as far as possible to its form under front-view shooting, which effectively reduces the influence of the shooting angle on state identification. However, the complex environment of a substation often also suffers from uneven illumination. To further reduce the influence of uneven illumination on state identification, in one embodiment the output image is not used directly as the image to be identified but is first remapped, which comprises the following steps:
(5) The gray-level histogram of the pixels in the output image is calculated, and the cumulative distribution function of each gray level in the gray-level histogram is calculated. The cumulative distribution function $\mathrm{cdf}(k)$ of any gray level $k$ equals the total number of pixels whose gray level lies in the range $[k_{\min}, k]$, $k_{\min} \le k \le k_{\max}$, where $k_{\min}$ is the minimum gray level and $k_{\max}$ is the maximum gray level.
(6) The gray level of every pixel whose gray level is $k$ in the output image is converted to $h(k)$ by a mapping computed from the cumulative distribution function, where $N$ is the total number of pixels contained in the output image and $\mathrm{round}(\cdot)$ denotes the rounding operation used in the mapping.
In this way, the image-corrected front-view image of the knob switch in the state to be identified is extracted, and an image to be identified with reduced influence from the shooting angle and uneven illumination is obtained.
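The gray-level remapping can be illustrated with a NumPy sketch of standard histogram equalization. The mapping used here, h(k) = round((cdf(k) − cdf_min) / (N − cdf_min) × 255) with cdf_min the smallest non-zero value of the cumulative distribution function, is the textbook form and is an assumption; it may differ from the exact formula of this application. The function name `equalize_gray` is hypothetical, and an 8-bit gray image is assumed.

```python
import numpy as np


def equalize_gray(image):
    """Equalize an 8-bit gray image via its cumulative distribution function."""
    image = np.asarray(image, dtype=np.uint8)
    hist = np.bincount(image.ravel(), minlength=256)  # gray-level histogram
    cdf = hist.cumsum()                               # pixels with level <= k
    n = image.size                                    # total number of pixels
    cdf_min = cdf[cdf > 0][0]                         # cdf at the lowest used level
    if n == cdf_min:                                  # constant image: nothing to do
        return image.copy()
    # Assumed textbook mapping; levels below the minimum are clipped to 0.
    lut = np.round((cdf - cdf_min) / (n - cdf_min) * 255)
    lut = lut.clip(0, 255).astype(np.uint8)
    return lut[image]
```

After this remapping the darkest gray level present maps to 0 and the brightest to 255, which stretches the contrast of unevenly lit knob images.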
What has been described above is only a preferred embodiment of the present application, which is not limited to the above examples. It is to be understood that other modifications and variations which may be directly derived or contemplated by those skilled in the art without departing from the spirit and concepts of the present application are to be considered as being included within the scope of the present application.