Identification method and device for an occluded target object
Technical Field
The invention relates to the technical field of object identification, and in particular to an identification method and device for an occluded target object.
Background
With the rapid development of computer networks, image recognition of objects plays an important role in Internet-of-things communication, and accurately recognizing object images can bring many conveniences to people's daily lives. For example, criminals can be located and caught through image recognition, and target objects can be tracked through image recognition. However, in the process of image recognition, some occlusion factors affect the recognition result of the object image, so the degree to which these occlusion factors affect the recognition result needs to be considered when recognizing an object image.
In the prior art, artificial-intelligence algorithms such as a YOLO model or a combined HOG and SVM model are mainly used to identify a target object intelligently: two adjacent frames of video image data are input into the YOLO model or the HOG and SVM model, and whether the two frames show the same object is determined through the intersection-over-union of the target candidate frames in the two adjacent frames. For example, in illegal-parking recognition, when a target vehicle is occluded by pedestrians or other vehicles, the same target object is determined to be two different objects before and after the occlusion, and the illegal-parking phenomenon is therefore recognized incorrectly. Recognition of illegal road parking based on this prior-art approach is thus not accurate enough: the recognition precision is low, and misjudgment easily occurs.
Disclosure of Invention
In view of this, an embodiment of the present invention provides a method for identifying an occluded target object, so as to solve the problem in the prior art that, when an occluding object occludes a target object, the video image in the target candidate frame is lost and the same target object is easily identified as two different objects, which introduces error and makes the identification result insufficiently accurate.
According to a first aspect, an embodiment of the present invention provides an identification method for an occlusion target object, including: acquiring a target object image and an occlusion object image at least partially occluding the target object; judging whether the target object and the shielding object are the same object at the same target position or not according to the target object image and the shielding object image; and identifying the target object and the shielding object according to the judgment result.
With reference to the first aspect, in a first implementation manner of the first aspect, the step of determining whether the target object and the shielding object are the same object at the same target position according to the target object image and the shielding object image further includes: constructing a twin network model; inputting the target object image and the shielding object image into the twin network model for detection, and outputting a first detection parameter corresponding to a detection result; judging whether the first detection parameter is less than or equal to a preset threshold value; when the first detection parameter is less than or equal to the preset threshold value, the target object and the shielding object are the same object at the same target position; and when the first detection parameter is greater than the preset threshold value, the target object and the shielding object are not the same object at the same position.
With reference to the first aspect, in a second implementation manner of the first aspect, the step of acquiring the target object image further includes: constructing a YOLO neural network model; inputting the target object video image data into the YOLO neural network model for detection, and outputting a first target area parameter and a second target area parameter which respectively correspond to the target object image data of two adjacent frames; calculating a second detection parameter corresponding to the target object image through an IOU algorithm according to the first target area parameter and the second target area parameter; and determining the target object image in the target video image data according to the second detection parameter.
With reference to the first aspect, in a third implementation manner of the first aspect, the step of acquiring an occlusion object image at least partially occluding the target object further includes: constructing a YOLO neural network model; inputting the video image data of the shielding object into the YOLO neural network model for detection, and outputting a third target area parameter and a fourth target area parameter which respectively correspond to the image data of the shielding object of two adjacent frames; calculating a third detection parameter corresponding to the image of the shielding object by an IOU algorithm according to the third target area parameter and the fourth target area parameter; and determining the image of the shielding object in the video image data of the shielding object according to the third detection parameter.
With reference to the first aspect or any implementation manner of the first aspect, in a fourth implementation manner of the first aspect, the target object image includes a vehicle image, a person image, an animal image, a plant image, a scenery image, or a fixed object image.
With reference to the fourth implementation manner of the first aspect, in a fifth implementation manner of the first aspect, when the target object image is a vehicle image, the method is configured to detect whether an illegal parking event of the target vehicle corresponding to the vehicle image occurs at the target position and to count the number of illegal parking events.
According to a second aspect, an embodiment of the present invention provides an identification apparatus for an occluded target object, including an acquisition module, a judging module and an identification module, wherein the acquisition module is configured to acquire a target object image and an occlusion object image at least partially occluding a target object; the judging module is configured to judge whether the target object and the shielding object are the same object at the same target position according to the target object image and the shielding object image; and the identification module is configured to identify the target object and the shielding object according to the judgment result.
According to a third aspect, an embodiment of the present invention provides an image recognition apparatus including: a memory, a processor and a computer program stored in the memory and executable on the processor, where the memory and the processor are communicatively connected, and the memory stores computer instructions, and the processor executes the computer instructions to perform the steps of the method for identifying an occlusion target object according to the first aspect or any embodiment of the first aspect.
According to a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, which stores computer instructions for causing a computer to execute the steps of the method for identifying an occlusion target object in the foregoing first aspect or any implementation manner of the first aspect.
The embodiment of the invention has the following advantages:
the embodiment of the invention provides an identification method, an identification device, a storage medium and an image identification apparatus for an occluded target object, wherein the identification method comprises the following steps: acquiring a target object image and an occlusion object image at least partially occluding the target object; judging whether the target object and the shielding object are the same object at the same target position according to the target object image and the shielding object image; and identifying the target object and the shielding object according to the judgment result. By determining whether the target object image and the shielding object image show the same object at the same target position, the target object or the shielding object at that position can be accurately identified, the target object at the same position is prevented from being mistakenly identified as two different objects due to interference from an external shielding object during identification, and the identification accuracy can be remarkably improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a first flowchart of a method for identifying an occluding target object in an embodiment of the present invention;
FIG. 2 is a second flowchart of a method for identifying an occlusion target object according to an embodiment of the present invention;
FIG. 3 is a third flowchart of a method for identifying an occlusion target object according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the IOU algorithm in an embodiment of the present invention;
FIG. 5 is a fourth flowchart of a method for identifying an occlusion target object according to an embodiment of the present invention;
FIG. 6 is a block diagram of an apparatus for identifying an occlusion target object according to an embodiment of the present invention;
FIG. 7 is a block diagram of a determining module of the device for identifying an occlusion target object according to the embodiment of the present invention;
FIG. 8 is a block diagram of an acquisition module of the device for identifying an occlusion target object according to the embodiment of the present invention;
FIG. 9 is a schematic diagram of a hardware structure of the image recognition apparatus in the embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example 1
The embodiment of the invention provides a method for identifying an occluded target object, used in the specific application scenario of illegal road parking. As shown in FIG. 1, the method comprises the following steps:
Step S1: acquiring a target object image and an occlusion object image at least partially occluding the target object. When the identification method of this embodiment is applied to the illegal road parking scenario, the target object image is a vehicle image; if the method is applied to other scenarios, the target object image may instead be a person image, an animal image, a plant image, a scenery image, or a fixed object image.
The occlusion object image at least partially occluding the target object covers both the case where the occluding object blocks a small partial area of the target object and the case where it blocks the target object completely. For example, when a motor vehicle is parked at a certain position and a pedestrian or another motor vehicle passes in front of it, the parked vehicle is partially or completely blocked by the pedestrian or the other vehicle.
In an embodiment, as shown in fig. 2, the step S1 of detecting the target object image in the execution process may specifically include the following steps:
step S101: constructing a YOLO neural network model; the YOLO neural network model is a convolutional neural network which can predict a plurality of positions and types at one time, and can realize end-to-end target detection and identification, so that the YOLO neural network model is widely applied to image identification.
Step S102: inputting the video image data of the target object into a YOLO neural network model for detection, and outputting a first target area parameter and a second target area parameter which respectively correspond to the image data of the target object of two adjacent frames; here, the target object video image data is continuous multi-frame video data, the image capturing apparatus captures continuous 100 frames of video image data as target video image data for identifying a target object image, and the 100 frames of video image data may be stored in a cache memory (cache) of the image identifying apparatus.
Step S103: calculating a second detection parameter corresponding to the target object image through an IOU algorithm according to the first target area parameter and the second target area parameter. The first target area parameter is the coordinate detection frame parameter corresponding to the first frame of target object image data in the two adjacent frames, and the second target area parameter is the coordinate detection frame parameter corresponding to the second frame. As shown in fig. 4, the first target area is a rectangular frame ABCD with parameters A(x1, y1), B(x2, y1), C(x2, y2), D(x1, y2); the second target area is a rectangular frame EFGH with parameters E(x3, y3), F(x4, y3), G(x4, y4), H(x3, y4); and the intersection points of the first target area and the second target area are J(x3, y1) and I(x2, y4).
Step S104: and determining a target object image in the target video image data according to the second detection parameter.
The IOU algorithm is based on a ratio of an intersection area and a union area of two rectangular coordinate frames corresponding to the first target area and the second target area, that is, in fig. 4, the IOU algorithm is used to calculate a ratio of an intersection area JBIH of a rectangular frame ABCD and a rectangular frame EFGH to an area of a union of the rectangular frame ABCD and the rectangular frame EFGH, and the larger the ratio is, the higher the coincidence degree of the two rectangular coordinate frames is proved to be. The IOU calculation of the first target area and the second target area specifically comprises the following steps:
area of the rectangle (ABCD) corresponding to the first target region parameter = (x2 - x1) * (y2 - y1);
area of the rectangle (EFGH) corresponding to the second target region parameter = (x4 - x3) * (y4 - y3);
area of overlap (JBIH) = (x2 - x3) * (y4 - y1);
IOU = area of overlap (JBIH) / [area of rectangle (ABCD) + area of rectangle (EFGH) - area of overlap (JBIH)] = (x2 - x3) * (y4 - y1) / [(x2 - x1) * (y2 - y1) + (x4 - x3) * (y4 - y3) - (x2 - x3) * (y4 - y1)].
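The IOU calculation above can be sketched as a short function; a minimal example assuming axis-aligned boxes given as (x_min, y_min, x_max, y_max) tuples in image coordinates:

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max) in image coordinates.
    Returns a value in [0, 1]: 0 means no overlap, 1 means identical boxes.
    """
    # Corners of the intersection rectangle (points J and I in FIG. 4).
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp to zero so disjoint boxes yield an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

For example, `iou((0, 0, 2, 2), (1, 1, 3, 3))` yields 1/7, since the intersection area is 1 and the union area is 4 + 4 - 1 = 7.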
In a specific embodiment, the step S1 of detecting the image of the blocking object in the execution process may specifically include the following steps:
firstly, constructing a YOLO neural network model;
inputting the video image data of the shielding object into a YOLO neural network model for detection, and outputting a third target area parameter and a fourth target area parameter which respectively correspond to the image data of the shielding object of two adjacent frames;
and thirdly, calculating a third detection parameter corresponding to the image of the shielding object through an IOU algorithm according to the third target area parameter and the fourth target area parameter.
And fourthly, determining an image of the shielding object in the video image data of the shielding object according to the third detection parameter.
The specific identification process, as shown in fig. 3, is as follows:
Firstly, inputting an original video frame image into the YOLO network for object detection;
secondly, judging whether the current frame is the first frame of the program run, that is, whether data of a previous frame exists;
thirdly, if the current frame is the first frame, initializing the cache, which stores the screenshot of each detected object, the coordinates of its detection frame, and the current frame number (frame_id);
fourthly, if the current frame is not the first frame, using the IOU algorithm to judge, for each motor vehicle of the previous frame stored in the cache, whether the current frame has a corresponding motor vehicle detection frame;
fifthly, if the current frame has a detection frame corresponding to a certain motor vehicle of the previous frame, updating the frame number (frame_id), vehicle screenshot and detection frame coordinates of that motor vehicle in the cache;
sixthly, if the current frame has no detection frame corresponding to a certain motor vehicle of the previous frame, judging whether the current frame number (frame_id) is greater than the frame number (frame_id) of that motor vehicle in the cache plus 100;
seventhly, if the judgment of step six finds the current frame number greater than the stored frame number plus 100, judging that the motor vehicle has driven out of the picture and deleting it from the cache;
eighthly, if the judgment of step six finds the current frame number less than or equal to the stored frame number plus 100, performing no operation.
All of the above steps are performed in a loop, once for each of the 100 frames of the target video image data.
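The per-frame loop above can be sketched as follows; a minimal illustration in which the cache is a plain dictionary, the IOU matching threshold is an assumed value (the text fixes no number), and the handling of vehicles that newly appear after the first frame is omitted for brevity:

```python
def iou(a, b):
    """Intersection over union of boxes (x_min, y_min, x_max, y_max)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0

EXPIRY_FRAMES = 100        # a vehicle unseen for this many frames has left the picture
IOU_MATCH_THRESHOLD = 0.5  # assumed matching threshold; the embodiment fixes no value

def update_cache(cache, detections, frame_id):
    """One iteration of the loop: match cached vehicles to current detections.

    cache: dict vehicle_id -> {"box": (...), "frame_id": last frame seen}
    detections: list of boxes produced by the detector for the current frame.
    """
    if not cache:  # first frame: initialize the cache with all detections
        for i, box in enumerate(detections):
            cache[i] = {"box": box, "frame_id": frame_id}
        return cache
    for vid in list(cache):
        entry = cache[vid]
        # Best-overlapping detection in the current frame, if any.
        best = max(detections, key=lambda b: iou(entry["box"], b), default=None)
        if best is not None and iou(entry["box"], best) >= IOU_MATCH_THRESHOLD:
            entry["box"] = best           # matched: update detection frame coordinates
            entry["frame_id"] = frame_id  # and the last-seen frame number
        elif frame_id > entry["frame_id"] + EXPIRY_FRAMES:
            del cache[vid]  # unseen for over 100 frames: vehicle drove away
        # otherwise: no operation -- the vehicle may merely be occluded
    return cache
```

In an actual deployment the `detections` list would come from the YOLO model's per-frame output; here the function only shows the cache bookkeeping of steps two through eight.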
The second detection parameter or the third detection parameter obtained in step S103 is typically a number between 0 and 1. Specifically, each single frame of vehicle image extracted from the target video image data is detected by the YOLO algorithm, which returns the values of the corresponding coordinate detection frame; the coordinate detection frames of two adjacent frames of vehicle images are then subjected to an IOU calculation (intersection-over-union calculation), which takes the two coordinate detection frames as input and returns a number between 0 and 1 representing the ratio of the intersection area to the union area of the two rectangular frames. The larger the ratio, the higher the coincidence degree of the two rectangles; the smaller the ratio, the lower the coincidence degree.
Step S104 above determines the target object image in the target video image data according to the second detection parameter, or determines the occlusion object image in the occlusion object video image data according to the third detection parameter. For example, the target video image data includes a target object (a vehicle), occluding objects (pedestrians, vehicles) and other objects such as buildings. Since the positions of one object in adjacent frames are continuous, the coincidence degree of two adjacent frames of the same object is usually large; if the second detection parameter or the third detection parameter is, for example, 0.9, the coincidence degree of the two adjacent frames is high, and the target object image can accordingly be determined in the target video image data.
Step S2: judging whether the target object and the shielding object are the same object at the same target position or not according to the target object image and the shielding object image; the same target position here may be regarded as the same target coordinate detection frame or the same target candidate frame in the image recognition process.
In a preferred embodiment, as shown in fig. 5, the step S2 may specifically include the following steps in the execution process:
step S201: constructing a twin network model; the twin neural network model is to input two images into the neural network model, so as to analyze the similarity of the two images.
Step S202: inputting the target object image and the shielding object image into the twin network model for detection, and outputting a first detection parameter corresponding to a detection result; in fact, two images before and after occlusion at the same position are input into the twin network model for detection, where the first detection parameter is also a numerical value between 0 and 1, and the numerical value represents a distance value (floating point number). For example: and respectively inputting the vehicle image before the shielding and the vehicle image after the shielding into the trained twin network model, and outputting a distance value, wherein the larger the distance value is, the more different the two images are, namely the target objects before and after the shielding at the same position are not the same object.
Step S203: judging whether the first detection parameter is less than or equal to a preset threshold value. For example, the preset threshold value is 0.47, which can be regarded as a reference value. The first detection parameter is compared with the preset threshold value to judge whether it is less than or equal to that threshold.
Step S204: when the first detection parameter is less than or equal to the preset threshold value, the target object and the shielding object are the same object at the same target position. For example, with the preset threshold 0.47 and a first detection parameter of 0.35, since 0.35 is less than 0.47, the target object and the shielding object are the same object at the same position: the vehicles before and after the shielding at the same position are judged to be the same vehicle, the vehicle has not driven away from the position within the preset time of 100 frames of target video images, and only one illegal parking event is reported.
Step S205: when the first detection parameter is greater than the preset threshold value, the target object and the shielding object are not the same object at the same position. When the shielding object shields the vehicle and the first detection parameter is greater than the preset threshold value 0.47, it can be determined that the vehicles before and after the shielding at the same position are two different vehicles: the first vehicle drove away from the position within the preset time of 100 frames of target video images and another vehicle stopped at the position, so two illegal parking events are reported. In this embodiment, whether the target object and the shielding object are the same object at the same position is detected for all 100 frames of video images in the preset target video image data according to steps S201 to S205.
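Steps S203 to S205 reduce to a simple threshold comparison; a minimal sketch, assuming the twin network's output distance has already been computed and using the embodiment's example threshold of 0.47:

```python
SAME_OBJECT_THRESHOLD = 0.47  # example reference value from the embodiment

def count_parking_events(distance, threshold=SAME_OBJECT_THRESHOLD):
    """Map the twin-network distance to the number of events to report.

    distance <= threshold: same vehicle before and after the occlusion,
    so only one illegal-parking event is reported (step S204).
    distance >  threshold: two different vehicles occupied the position,
    so two illegal-parking events are reported (step S205).
    """
    return 1 if distance <= threshold else 2
```

For instance, a distance of 0.35 yields one reported event, while a distance of 0.6 yields two.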
The step S2 may be executed in other manners, not limited to the twin network model.
Step S3: identifying the target object and the shielding object according to the judgment result. The judgment result covers two cases: in the first case, the target object and the occluding object are the same object; in the second case, they are not the same object. Accordingly, in this embodiment, if the illegally parked vehicles before and after the occlusion at the same position are the same vehicle, one illegal parking event is reported; if they are not the same vehicle, two illegal parking events are reported.
According to the embodiment of the invention, whether the target object image and the shielding object image show the same object at the same target position is determined, so that the target object or the shielding object at that position can be accurately identified, the target object at the same position is prevented from being mistakenly identified as two different objects due to interference from an external shielding object during identification, and the identification accuracy can be obviously improved.
Example 2
The embodiment of the invention provides a method for identifying an occluded target object, used in the specific application scenario of detecting criminal suspects and their accomplices. As shown in FIG. 1, the method comprises the following steps:
Step S1: acquiring a target object image and an occlusion object image at least partially occluding the target object. When the identification method of this embodiment is applied to the criminal-suspect scenario, the target object image is an image of the criminal suspect; if the method is applied to other scenarios, the target object image may instead be a vehicle image, an animal image, a plant image, a scenery image, or a fixed object image.
The occlusion object image at least partially occluding the target object covers both the case where the occluding object blocks a small partial area of the target object and the case where it blocks the target object completely. For example, when a criminal suspect stays at a certain position and another object passes in front of him or blocks him, the criminal suspect is occluded.
In an embodiment, as shown in fig. 2, the detecting the target object image in the step S1 in the process of execution may specifically include the following steps:
step S101: constructing a YOLO neural network model; the YOLO neural network model is a convolutional neural network which can predict a plurality of positions and types at one time, and can realize end-to-end target detection and identification;
step S102: inputting the video image data of the target object into a YOLO neural network model for detection, and outputting a first target area parameter and a second target area parameter which respectively correspond to the image data of the target object of two adjacent frames; here, the target object video image data is continuous multi-frame video data, the image capturing apparatus captures continuous 100 frames of video image data as target video image data for identifying a target object image, and the 100 frames of video image data may be stored in a cache memory (cache) of the image identifying apparatus.
Step S103: calculating a second detection parameter corresponding to the target object image through an IOU algorithm according to the first target area parameter and the second target area parameter. The first target area parameter is the coordinate detection frame parameter corresponding to the first frame of target object image data in the two adjacent frames, and the second target area parameter is the coordinate detection frame parameter corresponding to the second frame. As shown in FIG. 4, the first target area is a rectangular frame ABCD with parameters A(x1, y1), B(x2, y1), C(x2, y2), D(x1, y2); the second target area is a rectangular frame EFGH with parameters E(x3, y3), F(x4, y3), G(x4, y4), H(x3, y4); and the intersection points of the first target area and the second target area are J(x3, y1) and I(x2, y4).
Step S104: determining a target object image in the target video image data according to the second detection parameter; namely, the same suspect in the two frames of image data before and after is matched to obtain an object detection frame of the suspect in the image data.
The IOU algorithm is based on a ratio of an intersection area and a union area of two rectangular coordinate frames corresponding to the first target area and the second target area, that is, in fig. 4, the IOU algorithm is used to calculate a ratio of an intersection area JBIH of a rectangular frame ABCD and a rectangular frame EFGH to an area of a union of the rectangular frame ABCD and the rectangular frame EFGH, and the larger the ratio is, the higher the coincidence degree of the two rectangular coordinate frames is proved. The specific calculation process of the IOU of the first target area and the second target area is as follows:
area of the rectangle (ABCD) corresponding to the first target region parameter = (x2 - x1) * (y2 - y1);
area of the rectangle (EFGH) corresponding to the second target region parameter = (x4 - x3) * (y4 - y3);
area of overlap (JBIH) = (x2 - x3) * (y4 - y1);
IOU = area of overlap (JBIH) / [area of rectangle (ABCD) + area of rectangle (EFGH) - area of overlap (JBIH)] = (x2 - x3) * (y4 - y1) / [(x2 - x1) * (y2 - y1) + (x4 - x3) * (y4 - y3) - (x2 - x3) * (y4 - y1)].
In a specific embodiment, the step S1 of detecting the image of the blocking object in the execution process may specifically include the following steps:
firstly, constructing a YOLO neural network model;
inputting the video image data of the shielding object into a YOLO neural network model for detection, and outputting a third target area parameter and a fourth target area parameter which respectively correspond to the image data of the shielding object of two adjacent frames;
thirdly, calculating a third detection parameter corresponding to the image of the shielding object through an IOU algorithm according to the third target area parameter and the fourth target area parameter;
and fourthly, determining an image of the shielding object in the video image data of the shielding object according to the third detection parameter.
The second detection parameter or the third detection parameter obtained in step S103 is typically a number between 0 and 1. Specifically, each single frame of suspect image extracted from the target video image data is detected by the YOLO algorithm, which returns the values of the corresponding coordinate detection frame; the coordinate detection frames of two adjacent frames of suspect images are then subjected to an IOU calculation (intersection-over-union calculation), which takes the two coordinate detection frames as input and returns a number between 0 and 1 representing the ratio of the intersection area to the union area of the two rectangular frames. The larger the ratio, the higher the coincidence degree of the two rectangles; the smaller the ratio, the lower the coincidence degree.
Step S104 in the above: and determining a target object image in the target video image data according to the second detection parameter, or determining an occlusion object image in the occlusion object video image data according to the third detection parameter. For example: the target video image data includes a target object (a suspect), obstructing objects (pedestrians, vehicles) and other building objects. Since the adjacent positions of one object are continuous, the overlapping degree of two adjacent frames of images of the object is usually large; therefore, if the second detection parameter is 0.9, which indicates that the coincidence degree of the two adjacent frames of images is large, the target object image can be determined in the target video image data.
Step S2: judging whether the target object and the shielding object are the same object at the same target position or not according to the target object image and the shielding object image; the same target position here may be regarded as the same target coordinate detection frame or the same target candidate frame in the image recognition process.
In a preferred embodiment, as shown in fig. 5, the step S2 may specifically include the following steps in the execution process:
step S201: constructing a twin network model; the twin neural network model is to input two images into the neural network model, so as to analyze the similarity of the two images.
Step S202: and inputting the target object image and the shielding object image into the twin network model for detection, and outputting a first detection parameter corresponding to the detection result. In practice, the two images before and after occlusion at the same position are input into the twin network model for detection, and the first detection parameter is also a numerical value between 0 and 1 representing a distance value (a floating-point number). For example: the suspect image before shielding and the suspect image after shielding are respectively input into the trained twin network model, which outputs a distance value; the larger the distance value, the more different the two images are, that is, the target objects before and after shielding at the same position are not the same suspect.
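The role of the twin network output can be illustrated with a stand-in distance function; this is emphatically not the trained twin network of the embodiment, only a hypothetical mapping from two feature vectors to a 0-to-1 distance value of the kind described:

```python
import math

def embedding_distance(feat_a, feat_b):
    """Hypothetical stand-in for the twin-network output: Euclidean distance
    between two feature vectors squashed monotonically into [0, 1)."""
    d = math.sqrt(sum((a - b) ** 2 for a, b in zip(feat_a, feat_b)))
    # Identical features map to 0.0; very different features approach 1.0.
    return d / (1.0 + d)
```

Identical inputs produce 0.0 (same object), while dissimilar feature vectors produce a value approaching 1.0, matching the interpretation that a larger distance means a more different pair of images.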
Step S203: and judging whether the first detection parameter is less than or equal to a preset threshold value. For example: the preset threshold value is 0.47, which can be regarded as a reference value. The first detection parameter is compared with the preset threshold value to judge whether it is less than or equal to that threshold.
Step S204: and when the first detection parameter is smaller than or equal to the preset threshold value, the target object and the shielding object are the same object at the same target position. For example: the preset threshold is 0.47, the first detection parameter is 0.35, and since 0.35 is less than 0.47, the target object and the shielding object are the same object at the same position, for example: the suspect is shielded by other vehicles or pedestrians at a certain position, and when the first detection parameter is less than or equal to the preset threshold value of 0.47, the suspects before and after shielding at the same position can be judged to be the same suspect, and an alarm can be sent out once.
Step S205: and when the first detection parameter is greater than the preset threshold value, the target object and the shielding object are not the same object at the same position. When the shielding object shields the suspect and the first detection parameter is greater than the preset threshold value 0.47, it can be determined that the suspects before and after shielding at the same position are two different persons; in this case, the original suspect has left the position within the preset time of 100 frames of target video images, and when another suspect appears at the position, it indicates that the other suspect is a partner of the suspect who just drove away from the position, so two alarms are issued.
The step S2 may be executed in other manners, not limited to the twin network model.
Step S3: and identifying the target object and the shielding object according to the judgment result. The judgment result here covers two cases: in the first case, the target object and the shielding object are the same object; in the second case, they are not the same object. Correspondingly, in this embodiment, when the suspects before and after the occlusion at the same position are the same person, an alarm is given once; when they are not the same person, the alarm is issued twice.
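The decision logic of steps S203-S205 and S3 can be sketched in a few lines; the function name and return convention are assumptions for illustration, with the 0.47 threshold and the one-alarm/two-alarm rule taken from the text:

```python
def classify_and_alarm(first_detection_parameter, threshold=0.47):
    """Steps S203-S205/S3 sketch: parameter at or below the threshold means
    the same object (one alarm); above it means different objects (two alarms)."""
    same_object = first_detection_parameter <= threshold
    alarms = 1 if same_object else 2
    return same_object, alarms
```

With the example values from the text, a first detection parameter of 0.35 is below 0.47, so the objects are judged to be the same and one alarm is raised.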
According to the embodiment of the invention, whether the target object and the shielding object are the same object at the same target position is determined through the target object image and the shielding object image, so that the target object or the shielding object at the same position can be accurately identified. This avoids the situation that, due to the interference of the shielding object, the target object at the same position is mistakenly identified as two different objects in the process of identifying the target object, and the identification accuracy can be obviously improved.
The identification method for the occluded target object in the embodiment of the invention can also be applied to other application scenes to identify the occluded target object.
Example 3
The present embodiment provides an identification apparatus for an occlusion target object, as shown in fig. 6, including:
an obtaining module 41, configured to obtain a target object image and an occlusion object image at least partially occluding the target object;
the judging module 42 is configured to judge whether the target object and the shielding object are the same object at the same target position according to the target object image and the shielding object image;
and the identification module 43 is configured to identify the target object and the shielding object according to the determination result.
Preferably, as shown in fig. 7, the determining module 42 further includes:
the first constructing submodule 421 is configured to construct a twin network model, where the twin network model takes a detection screenshot before occlusion and a detection screenshot after occlusion disappears as inputs.
The first detection submodule 422 is configured to input the target object image and the shielding object image into the twin network model for detection, and output a first detection parameter corresponding to a detection result.
A first determining submodule 423, configured to determine whether the first detection parameter is smaller than or equal to a preset threshold value.
preferably, as shown in fig. 8, the obtaining module 41 further includes:
a second constructing submodule 411, configured to construct a YOLO neural network model, where the YOLO neural network model is used to detect and track a target object;
the second detection submodule 412 is configured to input the target object video image data to the YOLO neural network model for detection, and output a first target area parameter and a second target area parameter corresponding to the target object image data of two adjacent frames respectively;
the first calculation submodule 413 is configured to calculate, according to the first target area parameter and the second target area parameter, a second detection parameter corresponding to the target object image through an IOU algorithm;
and a second determining sub-module 414, configured to determine a target object image or an occlusion object image in the target video image data according to the second detection parameter.
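The module layout above can be mirrored as a minimal class skeleton; the class name, the callable `detector` and `matcher` hooks (stand-ins for the YOLO model of module 41 and the twin network of module 42), and the default threshold are all illustrative assumptions:

```python
class OcclusionTargetIdentifier:
    """Illustrative skeleton mirroring modules 41-43 of the apparatus."""

    def __init__(self, detector, matcher, threshold=0.47):
        self.detector = detector    # obtaining module 41 (e.g. a YOLO model)
        self.matcher = matcher      # judging module 42 (e.g. a twin network)
        self.threshold = threshold

    def identify(self, frame_before, frame_after):
        # Module 41: extract the object images before and after occlusion.
        img_a = self.detector(frame_before)
        img_b = self.detector(frame_after)
        # Module 42: distance between the two images (first detection parameter).
        distance = self.matcher(img_a, img_b)
        # Module 43: same object iff the distance is within the threshold.
        return distance <= self.threshold
```

With trivial placeholder callables (an identity detector and an absolute-difference matcher), the class reproduces the threshold decision of the judging module.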
According to the device, whether the target object image and the shielding object image correspond to the same object at the same target position is determined, so that the target object or the shielding object at the same position can be accurately identified. This avoids the problem that the target object at the same position is mistakenly identified as two different objects due to the interference of the shielding object in the process of identifying the target object, and the identification accuracy can be obviously improved.
Example 4
The present embodiment provides an image recognition apparatus including a memory and a processor, wherein the processor is configured to read instructions stored in the memory to implement the steps of the recognition method of an occlusion target object of embodiment 1 or embodiment 2 when executing a program.
The image recognition apparatus may represent the master image recognition apparatus or the standby image recognition apparatus thereof in embodiment 1 and each slave image recognition apparatus in embodiment 2. As shown in fig. 9, the apparatus includes a memory 920, a processor 910, and a computer program stored in the memory 920 and executable on the processor 910, and the processor 910 implements the steps of the method in embodiment 1 or embodiment 2 when executing the program.
Fig. 9 is a schematic diagram of a hardware structure of an image recognition device for performing a processing method of list item operations according to an embodiment of the present invention, as shown in fig. 9, the image recognition device includes one or more processors 910 and a memory 920, where one processor 910 is taken as an example in fig. 9.
The image recognition apparatus that performs the processing method of the list item operation may further include: an input device 930 and an output device 940.
The processor 910, the memory 920, the input device 930, and the output device 940 may be connected by a bus or other means, and fig. 9 illustrates an example of a connection by a bus.
The processor 910 may be a central processing unit (CPU). The processor 910 may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or any combination thereof.
Example 5
The present embodiment provides a storage medium comprising a memory and a processor having stored thereon computer instructions which, when executed by the processor, implement the steps of the identification method of an occlusion target object of embodiment 1 or embodiment 2. The storage medium is further stored with a target object image, a shielding object image, target video image data, a first detection parameter, a second detection parameter, and the like, wherein the storage medium may be a magnetic Disk, an optical Disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a Flash Memory (Flash Memory), a Hard Disk (Hard Disk Drive, abbreviated as HDD), a Solid State Drive (SSD), or the like; the storage medium may also comprise a combination of memories of the kind described above.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by hardware related to instructions of a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a Random Access Memory (RAM), or the like.
It should be understood that the above examples are only for clarity of illustration and are not intended to limit the embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. And are neither required nor exhaustive of all embodiments. And obvious variations or modifications therefrom are within the scope of the invention.