WO2021143865A1 - Positioning method and apparatus, electronic device, and computer-readable storage medium
- Publication number
- WO2021143865A1 (PCT/CN2021/072210)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- target
- feature point
- distance
- feature map
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/761—Proximity, similarity or dissimilarity measures
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2210/00—Indexing scheme for image generation or computer graphics
- G06T2210/12—Bounding box
Definitions
- the present disclosure relates to the fields of computer technology and image processing, and in particular to a positioning method and device, electronic equipment, and computer-readable storage media.
- Object detection or object positioning is an important basic technology in computer vision, which can be applied to scenes such as instance segmentation, object tracking, person recognition, and face recognition.
- Object detection or object positioning usually relies on anchor frames. However, when many anchor frames are used and each anchor frame has weak expression ability, object positioning suffers from defects such as a large amount of calculation and inaccurate positioning.
- the present disclosure provides at least one positioning method and device.
- the present disclosure provides a positioning method, including: acquiring a target image, where the target image includes at least one object to be positioned; determining, based on an image feature map of the target image, object type information of the object to which each feature point in the image feature map belongs, object frame information of the object to which each feature point belongs, a first confidence of the object type information, and a second confidence of the object frame information; determining, based on the first confidence and the second confidence, a target confidence of the object frame information of the object to which each feature point belongs; and determining positioning information of the object in the target image based on the object frame information of the object to which each feature point belongs and the target confidence of the object frame information.
- the image feature map of the target image makes it possible to determine only one anchor frame for each feature point in the image feature map, namely the object frame corresponding to the object frame information. This reduces the number of anchor frames used in the object positioning process, which reduces the amount of calculation and improves the efficiency of object positioning.
- the image feature map of the target image can also be used to determine the object type information of the object to which each feature point in the image feature map belongs, the confidence of the object frame information, and the confidence of the object type information; the final confidence of the object frame information is then determined from these two confidences. This effectively enhances the information expression ability of the object frame or the object frame information: it expresses not only the positioning information and object type information of the object frame corresponding to the object frame information, but also the confidence of the object frame information, thereby helping to improve the accuracy of object positioning based on the object frame.
- the image feature map includes a classification feature map used to classify the object to which the feature points in the image feature map belong, and a positioning feature map used to position the object to which the feature points in the image feature map belong.
- determining, based on the image feature map of the target image, the object type information of the object to which each feature point belongs, the object frame information of the object to which each feature point belongs, the first confidence of the object type information, and the second confidence of the object frame information includes: for each feature point in the image feature map, determining, based on the classification feature map, the object type information of the object to which the feature point belongs and the first confidence of the object type information; and determining, based on the positioning feature map, the object frame information of the object to which the feature point belongs and the second confidence of the object frame information.
- based on the classification feature map and the positioning feature map of the target image, not only the object frame information of the object to which each feature point in the image feature map belongs is determined, but also the object type information of that object, as well as the respective confidences of the object type information and the object frame information, which improves the information expression ability of the object frame and thereby helps to improve the accuracy of object positioning based on the object frame.
- for each feature point in the image feature map, determining, based on the positioning feature map, the object frame information of the object to which the feature point belongs includes:
- for each feature point in the image feature map, determining, based on the positioning feature map, the target distance range in which the distance between the feature point and each boundary of the object frame of the object to which the feature point belongs is located;
- determining, based on the target distance range and the positioning feature map, the target distance between the feature point and each boundary of the object frame of the object to which the feature point belongs; and
- determining the object frame information of the object to which the feature point belongs based on the position information of the feature point in the image feature map and the target distance between the feature point and each boundary.
- the target distance range in which the distance between the feature point and each boundary of the object frame lies is determined first, and the target distance between the feature point and each boundary is then determined based on that range; this two-step processing improves the accuracy of the determined target distance. Based on the determined precise target distance, an accurately positioned object frame can then be determined for the feature point, which improves the accuracy of the determined object frame.
- determining the target distance range in which the distance between the feature point and each boundary of the object frame of the object to which the feature point belongs is located includes: for each boundary of the object frame, determining the maximum distance between the feature point and the boundary based on the positioning feature map; segmenting the maximum distance to obtain multiple distance ranges; determining, based on the positioning feature map, a first probability value that the distance between the feature point and the boundary falls within each distance range; and selecting, based on the determined first probability values, the target distance range in which the distance between the feature point and the boundary is located from the multiple distance ranges.
- the distance range corresponding to the maximum probability value can be selected as the target distance range in which the distance between the feature point and a given boundary lies, which improves the accuracy of the determined target distance range and thereby helps to improve the accuracy of the distance to that boundary determined based on the target distance range.
- selecting, based on the determined first probability values, the target distance range in which the distance between the feature point and the boundary is located from the multiple distance ranges includes:
- the distance range corresponding to the largest first probability value is used as the target distance range.
- selecting, based on the determined first probability values, the target distance range in which the distance between the feature point and the boundary is located from the multiple distance ranges includes: determining, based on the positioning feature map, a distance uncertainty parameter value of the distance between the feature point and the boundary; determining, based on the distance uncertainty parameter value and each first probability value, a target probability value that the distance between the feature point and the boundary falls within each distance range; and taking the distance range corresponding to the maximum target probability value as the target distance range in which the distance between the feature point and the boundary is located.
- in addition to the first probability value that the distance between the feature point and a boundary falls within each distance range, a distance uncertainty parameter value is determined, based on which the first probability values can be corrected to obtain the target probability values. This improves the accuracy of the determined probability values and thereby helps to improve the accuracy of the target distance range determined based on them.
- determining the second confidence of the object frame information includes: determining the second confidence of the object frame information of the object to which the feature point belongs based on the first probability values corresponding to the target distance ranges in which the distances between the feature point and the respective boundaries of the object frame of the object to which the feature point belongs are located.
- determining the second confidence of the object frame information of the object to which the feature point belongs includes: obtaining the mean of the first probability values corresponding to the target distance ranges in which the distances between the feature point and the respective boundaries of the object frame are located; and taking the mean as the second confidence.
- the first probability value corresponding to the distance range in which the distance between the feature point and each boundary is located can thus be used to determine the confidence of the object frame information of the object to which the feature point belongs, which enhances the information expression ability of the object frame.
- for each feature point in the image feature map, determining, based on the classification feature map, the object type information of the object to which the feature point belongs includes: determining, based on the classification feature map, a second probability value that the object to which the feature point belongs is of each preset object type; and
- determining the object type information of the object to which the feature point belongs based on the preset object type corresponding to the largest second probability value.
- the preset object type corresponding to the largest second probability value is selected as the object type information of the object to which the feature point belongs, which improves the accuracy of the determined object type information.
- determining the positioning information of the object in the target image based on the object frame information of the object to which each feature point belongs and the target confidence of the object frame information includes:
- Multiple target feature points are filtered from the image feature map, wherein the distance between the multiple target feature points is less than a preset threshold, and the object type information of the object to which each target feature point belongs is the same;
- from the object frame information of the objects to which the target feature points belong, the object frame information with the highest target confidence is selected as target frame information; and the positioning information of the object in the target image is determined based on the selected target frame information and the target confidence of the target frame information.
- selecting the object frame information with the highest target confidence from among closely spaced feature points with the same object type information to position the object can effectively reduce the amount of object frame information used for object positioning, which helps to improve the timeliness of object positioning.
- the present disclosure provides a positioning device, including:
- An image acquisition module for acquiring a target image, wherein the target image includes at least one object to be located;
- An image processing module configured to determine, based on the image feature map of the target image, the object type information of the object to which each feature point in the image feature map belongs, the object frame information of the object to which each feature point belongs, the first confidence of the object type information, and the second confidence of the object frame information;
- a confidence processing module configured to determine the target confidence of the object frame information of the object to which each feature point belongs based on the first confidence and the second confidence;
- the positioning module is configured to determine the positioning information of the object in the target image based on the object frame information of the object to which each feature point belongs and the target confidence of the object frame information.
- the image feature map includes a classification feature map used to classify the object to which the feature points in the image feature map belong, and a positioning feature map used to position the object to which the feature points in the image feature map belong.
- the image processing module is configured to: for each feature point in the image feature map, determine, based on the classification feature map, the object type information of the object to which the feature point belongs and the first confidence of the object type information; and determine, based on the positioning feature map, the object frame information of the object to which the feature point belongs and the second confidence of the object frame information.
- when determining, based on the positioning feature map, the object frame information of the object to which each feature point belongs, the image processing module is configured to: for each feature point in the image feature map, determine, based on the positioning feature map, the target distance range in which the distance between the feature point and each boundary of the object frame of the object to which the feature point belongs is located; determine, based on the target distance range and the positioning feature map, the target distance between the feature point and each boundary; and determine the object frame information of the object to which the feature point belongs based on the position information of the feature point in the image feature map and the target distance between the feature point and each boundary.
- when determining the target distance range in which the distance between the feature point and each boundary of the object frame of the object to which the feature point belongs is located, the image processing module is configured to: for each boundary, determine the maximum distance between the feature point and the boundary based on the positioning feature map; segment the maximum distance to obtain multiple distance ranges; determine the first probability value that the distance between the feature point and the boundary falls within each distance range; and select, based on the determined first probability values, the target distance range from the multiple distance ranges.
- when selecting, based on the determined first probability values, the target distance range in which the distance between the feature point and the boundary is located from the multiple distance ranges, the image processing module is configured to:
- take the distance range corresponding to the largest first probability value as the target distance range.
- alternatively, when selecting, based on the determined first probability values, the target distance range in which the distance between the feature point and the boundary is located from the multiple distance ranges, the image processing module is configured to: determine, based on the positioning feature map, a distance uncertainty parameter value of the distance between the feature point and the boundary; determine, based on the distance uncertainty parameter value and each first probability value, the target probability value that the distance falls within each distance range; and take the distance range corresponding to the maximum target probability value as the target distance range.
- when determining the second confidence of the object frame information, the image processing module is configured to: determine the second confidence of the object frame information of the object to which the feature point belongs based on the first probability values corresponding to the target distance ranges in which the distances between the feature point and the respective boundaries of the object frame are located.
- when determining the second confidence of the object frame information of the object to which the feature point belongs, the image processing module is configured to: obtain the mean of the first probability values corresponding to the target distance ranges in which the distances between the feature point and the respective boundaries of the object frame are located, and take the mean as the second confidence.
- when determining, based on the classification feature map, the object type information of the object to which each feature point belongs, the image processing module is configured to: determine, based on the classification feature map, the second probability value that the object to which the feature point belongs is of each preset object type; and determine the object type information of the object to which the feature point belongs based on the preset object type corresponding to the largest second probability value.
- the positioning module is used to:
- Multiple target feature points are filtered from the image feature map, wherein the distance between the multiple target feature points is less than a preset threshold, and the object type information of the object to which each target feature point belongs is the same;
- from the object frame information of the objects to which the target feature points belong, select the object frame information with the highest target confidence as target frame information; and determine the positioning information of the object in the target image based on the selected target frame information and the target confidence of the target frame information.
- the present disclosure provides an electronic device including a processor, a memory, and a bus.
- the memory stores machine-readable instructions executable by the processor.
- the processor communicates with the memory through the bus, and when the machine-readable instructions are executed by the processor, the steps of the positioning method described above are performed.
- the present disclosure also provides a computer-readable storage medium storing a computer program which, when run by a processor, performs the steps of the positioning method described above.
- the apparatus, electronic device, and computer-readable storage medium of the present disclosure contain technical features substantially the same as or similar to those of any aspect of the method or any implementation of any aspect of the present disclosure.
- for the effects of the apparatus, electronic device, and computer-readable storage medium, refer to the description of the effects of the method, which is not repeated here.
- FIG. 1 shows a flowchart of a positioning method provided by an embodiment of the present disclosure
- FIG. 2 shows a flowchart of a positioning method provided by an embodiment of the present disclosure
- FIG. 3 shows a flowchart of a positioning method provided by an embodiment of the present disclosure
- FIG. 4 shows a flowchart of a positioning method provided by an embodiment of the present disclosure
- FIG. 5 shows a flowchart of a positioning method provided by an embodiment of the present disclosure
- FIG. 6 shows a schematic structural diagram of a positioning device provided by an embodiment of the present disclosure
- FIG. 7 shows a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
- the present disclosure provides a positioning method and device, an electronic device, and a computer-readable storage medium. Based on the image feature map of the target image, only one anchor frame is determined for each feature point in the image feature map, that is, the object frame corresponding to the object frame information, which reduces the number of anchor frames used in the object positioning process and the amount of calculation.
- the image feature map of the target image can also be used to determine the object type information of the object to which each feature point in the image feature map belongs, the confidence of the object frame information, and the confidence of the object type information; the final confidence of the object frame information is then determined from these two confidences, which effectively enhances the information expression ability of the object frame and helps to improve the accuracy of object positioning based on the object frame.
- the embodiments of the present disclosure provide a positioning method, which is applied to a terminal device for positioning an object in an image.
- the terminal device may be a camera, a mobile phone, a wearable device, a personal computer, etc., which is not limited in the embodiment of the present disclosure.
- the positioning method provided by an embodiment of the present disclosure includes steps S110 to S140.
- the target image may be an image including a target object captured during the object tracking process, or an image including a human face captured during face detection.
- the purpose of the target image is not limited in the present disclosure.
- the target image includes at least one object to be positioned.
- the objects here may be things, people, animals, and the like.
- the target image may be captured by the terminal device that executes the positioning method of this embodiment, or it may be captured by another device and transmitted to the terminal device that executes the positioning method of this embodiment.
- the method of obtaining the target image is not limited in the present disclosure.
- before performing this step, the target image needs to be processed to obtain the image feature map of the target image.
- a convolutional neural network can be used to extract image features of the target image to obtain an image feature map.
- in this way, the image feature map of the target image is determined.
- the object type information of the object to which the feature point belongs, the object frame information of the object to which the feature point belongs, the first confidence of the object type information, and the second confidence of the object frame information can be determined.
- a convolutional neural network may be used to perform further image feature extraction on the image feature map to obtain the object type information, the object frame information, the first confidence level and the second confidence level.
- the object type information includes the object category of the object to which the feature point belongs.
- the object frame information includes the distance between the feature point and each boundary of the object frame corresponding to the object frame information.
- the object frame may also be referred to as an anchor frame.
- the first confidence is used to characterize the accuracy or credibility of the object type information determined based on the image feature map.
- the second confidence is used to characterize the accuracy or credibility of the object frame information determined based on the image feature map.
- S130 Based on the first confidence and the second confidence, respectively determine the target confidence of the object frame information of the object to which each feature point belongs.
- the product of the first confidence and the second confidence may be used as the target confidence of the object frame information.
- the target confidence is used to comprehensively characterize the positioning accuracy and classification accuracy of the object frame corresponding to the object frame information.
- the preset weight of the first confidence, the preset weight of the second confidence, the first confidence and the second confidence can be combined to determine the target confidence.
- the disclosure does not limit the specific implementation scheme for determining the target confidence based on the first confidence and the second confidence.
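- As an illustration, the following minimal Python sketch shows one way this confidence fusion could look; the weighted variant, the function name, and the default weights are assumptions for illustration, not a formula fixed by the present disclosure.

```python
# A minimal sketch of fusing the first (classification) and second
# (localization) confidences into the target confidence. The weighted
# geometric form is only one possible reading of "combining preset weights".

def target_confidence(first_conf: float, second_conf: float,
                      w1: float = 1.0, w2: float = 1.0) -> float:
    # With w1 == w2 == 1.0 this reduces to the plain product mentioned above.
    return (first_conf ** w1) * (second_conf ** w2)

print(target_confidence(0.9, 0.8))            # -> 0.72 (plain product)
print(target_confidence(0.9, 0.8, 0.5, 0.5))  # weighted variant
```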
- S140 Determine positioning information of the object in the target image based on the object frame information of the object to which each feature point belongs and the target confidence of the object frame information.
- the object frame information of the object to which a feature point belongs and the target confidence of that object frame information can be used as the positioning information, in the target image, of the object to which the feature point belongs; then, based on the positioning information of the objects to which the individual feature points belong, the positioning information of each object in the target image is determined.
- determining the target confidence of the object frame information effectively enhances the information expression ability of the object frame or the object frame information: it expresses not only the positioning information and object type information of the object frame corresponding to the object frame information, but also the confidence of the object frame information, thereby helping to improve the accuracy of object positioning based on the object frame.
- the above embodiment can determine a single anchor frame for each feature point in the image feature map based on the image feature map of the target image, that is, the object frame corresponding to the object frame information, which reduces the number of anchor frames used in the object positioning process, reduces the amount of calculation, and improves the efficiency of object positioning.
- the image feature map includes a classification feature map used to classify the object to which the feature points in the image feature map belong, and a positioning feature map used to position the object to which the feature points in the image feature map belong.
- a convolutional neural network can be used to extract the image features of the target image to obtain an initial feature map, and four 3×3 convolutional layers, each with 256 input and 256 output channels, can then be used to process the initial feature map to obtain the classification feature map and the positioning feature map.
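- For illustration, a minimal sketch of such a two-branch head is shown below; whether the four convolutional layers are shared between the branches, the ReLU activations, and the tensor shapes are assumptions not fixed by the text above.

```python
import torch
import torch.nn as nn

# Minimal sketch of the two-branch head described above. The text states
# four 3x3 convolutional layers with 256 input and output channels; whether
# the stack is shared or duplicated per branch is an assumption here.

def make_branch(channels: int = 256, depth: int = 4) -> nn.Sequential:
    layers = []
    for _ in range(depth):
        layers += [nn.Conv2d(channels, channels, kernel_size=3, padding=1),
                   nn.ReLU(inplace=True)]
    return nn.Sequential(*layers)

cls_branch = make_branch()   # yields the classification feature map
loc_branch = make_branch()   # yields the positioning feature map

initial = torch.randn(1, 256, 64, 64)   # backbone output (assumed shape)
cls_feat = cls_branch(initial)          # (1, 256, 64, 64)
loc_feat = loc_branch(initial)          # (1, 256, 64, 64)
```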
- determining the object type information, the object frame information, the first confidence of the object type information, and the second confidence of the object frame information can be implemented by the following steps: based on the classification feature map, determine the object type information of the object to which each feature point in the image feature map belongs and the first confidence of the object type information; and based on the positioning feature map, determine the object frame information of the object to which each feature point belongs and the second confidence of the object frame information.
- a convolutional neural network or a convolutional layer may be used to perform image feature extraction on the classification feature map to obtain the object type information of the object to which each feature point belongs, and the first confidence level of the object type information.
- by using a convolutional neural network or a convolutional layer to perform image feature extraction on the positioning feature map, the object frame information of the object to which each feature point belongs and the second confidence of the object frame information are obtained.
- based on the classification feature map and the positioning feature map, not only the object frame information of the object to which each feature point in the image feature map belongs is determined, but also the object type information of that object and the respective confidences of the object type information and the object frame information, which improves the information expression ability of the object frame and thereby helps to improve the accuracy of object positioning based on the object frame.
- determining the object frame information of the object to which each feature point in the image feature map belongs can be implemented by using steps S310 to S330.
- each boundary in the object frame may be a boundary of the object frame in various directions, for example, the upper boundary, the lower boundary, the left boundary, and the right boundary in the object frame.
- specifically, for each boundary of the object frame of the object to which the feature point belongs, the maximum distance between the feature point and the boundary can first be determined based on the positioning feature map, and the maximum distance can be segmented to obtain multiple distance ranges; next, a convolutional neural network or a convolutional layer performs image feature extraction on the positioning feature map to determine the first probability value that the distance between the feature point and the boundary falls within each distance range; finally, based on the determined first probability values, the target distance range in which the distance between the feature point and the boundary is located is selected from the multiple distance ranges. Specifically, the distance range corresponding to the largest first probability value may be used as the target distance range.
- the object frame may include, for example, an upper boundary, a lower boundary, a left boundary, and a right boundary. Taking the left boundary as an example, the maximum distance between the feature point and the left boundary may be segmented into five distance ranges, and five first probability values a, b, c, d, and e of the five distance ranges corresponding to the left boundary are determined.
- selecting the distance range corresponding to the maximum probability value as the target distance range in which the distance between the feature point and the boundary is located improves the accuracy of the determined target distance range, thereby helping to improve the accuracy of the distance to that boundary determined based on the target distance range.
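- A brief sketch of this range selection is given below, under the assumption of equal-width distance ranges (the text above only states that the maximum distance is segmented):

```python
import numpy as np

# Sketch of the range selection above: split the maximum distance into equal
# bins and pick the bin with the largest first probability value.

def pick_target_range(max_dist: float, first_probs: np.ndarray) -> tuple:
    n = len(first_probs)                       # number of distance ranges
    edges = np.linspace(0.0, max_dist, n + 1)  # bin edges over [0, max_dist]
    k = int(np.argmax(first_probs))            # largest first probability
    return float(edges[k]), float(edges[k + 1])

# Five ranges for the left boundary, probabilities a..e from the example:
a_to_e = np.array([0.05, 0.10, 0.60, 0.15, 0.10])
print(pick_target_range(100.0, a_to_e))        # -> (40.0, 60.0)
```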
- S320 Based on the target distance range and the positioning feature map, respectively determine the target distance between the feature point and each boundary in the object frame of the object to which the feature point belongs.
- here, a regression network matching the target distance range, such as a convolutional neural network, can be selected to perform image feature extraction on the positioning feature map to obtain the target distance between the feature point and each boundary of the object frame of the object to which the feature point belongs.
- a convolutional neural network is further used to determine an accurate distance, which can effectively improve the accuracy of the determined distance.
- a preset or trained parameter or weight N can be used to correct the determined target distance to obtain the final target distance.
- the precise target distance between the feature point and the left boundary is determined using this step.
- the target distance is marked in Figure 2 and denoted by f.
- the determined target distance is within the determined target distance range.
- S330 Determine the object frame information of the object to which the feature point belongs based on the position information of the feature point in the image feature map and the target distance between the feature point and each boundary.
- the location information of the feature point in the image feature map and the target distance between the feature point and each boundary can be used to determine the location information of each boundary in the object frame corresponding to the object frame information in the image feature map.
- the position information of all boundaries in the object frame in the image feature map can be used as the object frame information of the object to which the feature point belongs.
- after this two-step processing, the accuracy of the determined target distance can be improved; based on the determined precise target distance, an accurately positioned object frame can then be determined for the feature point, which improves the accuracy of the determined object frame.
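- A minimal sketch of this decoding step follows; the (left, top, right, bottom) coordinate convention is an assumption:

```python
# Minimal sketch of step S330: decoding the object frame from the feature
# point's position and its target distances to the four boundaries.

def decode_box(px: float, py: float,
               left: float, top: float, right: float, bottom: float) -> tuple:
    # Boundary positions of the object frame in feature-map coordinates.
    return (px - left, py - top, px + right, py + bottom)

print(decode_box(50.0, 40.0, left=10.0, top=8.0, right=12.0, bottom=20.0))
# -> (40.0, 32.0, 62.0, 60.0)
```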
- in another embodiment, selecting the target distance range in which the distance between the feature point and the boundary is located can be implemented through steps S410 to S430.
- a convolutional neural network can be used to determine the first probability value where the distance between the feature point and a certain boundary is within each distance range, and at the same time determine the distance uncertainty parameter value of the distance between the feature point and the boundary.
- the distance uncertainty parameter value here can be used to characterize the credibility of the determined first probability values.
- S420 Based on the distance uncertainty parameter value and each first probability value, determine the target probability value where the distance between the characteristic point and the boundary is within each distance range.
- each first probability value is corrected by using the distance uncertainty parameter value to obtain the corresponding target probability value.
- for example, the target probability value may be computed as $p_{x,n} = \frac{\exp(s_{x,n}/\sigma_x)}{\sum_{m=1}^{N} \exp(s_{x,m}/\sigma_x)}$, where:
- $p_{x,n}$ represents the target probability value that the distance between the feature point and the boundary $x$ falls within the $n$-th distance range;
- $N$ represents the number of distance ranges;
- $\sigma_x$ represents the distance uncertainty parameter value corresponding to the boundary $x$;
- $s_{x,n}$ represents the first probability value that the distance between the feature point and the boundary $x$ falls within the $n$-th distance range;
- $s_{x,m}$ represents the first probability value that the distance between the feature point and the boundary $x$ falls within the $m$-th distance range.
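- The following sketch mirrors the formula above; it is an illustration rather than the verbatim implementation of this embodiment:

```python
import numpy as np

# Sketch of the correction in S420: the first probability values are scaled
# by the distance uncertainty sigma_x before normalization.

def target_probs(first_probs: np.ndarray, sigma: float) -> np.ndarray:
    z = first_probs / sigma        # s_{x,n} / sigma_x
    z = z - z.max()                # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()             # p_{x,n}

s = np.array([2.0, 1.0, 0.5, 0.2])
print(target_probs(s, sigma=0.5))  # low uncertainty: sharper distribution
print(target_probs(s, sigma=2.0))  # high uncertainty: flatter distribution
```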
- S430 Based on the determined target probability values, select the target distance range in which the distance between the feature point and the boundary is located from the multiple distance ranges.
- the distance range corresponding to the maximum target probability value can be selected as the target distance range.
- while the first probability value that the distance between the feature point and a boundary falls within each distance range is determined, a distance uncertainty parameter value is also determined, based on which the first probability values can be corrected to obtain the target probability value that the distance falls within each distance range. This improves the accuracy of the determined probability values and is thereby beneficial to the accuracy of the target distance range determined based on them.
- the following steps can be used to determine the confidence of the corresponding object frame information, that is, the second confidence: based on the first probability values corresponding to the target distance ranges in which the distances between the feature point and the respective boundaries of the object frame of the object to which the feature point belongs are located, determine the second confidence of the object frame information of the object to which the feature point belongs.
- the average of the first probability values corresponding to the target distance ranges in which the distances between the feature point and all boundaries of the object frame of the object to which it belongs are located may be used as the second confidence.
- the first probability value corresponding to the distance range in which the distance between the feature point and each boundary is located can thus be used to determine the confidence of the object frame information of the object to which the feature point belongs, that is, the second confidence, which enhances the information expression ability of the object frame.
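- A minimal sketch of this averaging, assuming the target distance range of each boundary was selected by the largest first probability value:

```python
import numpy as np

# Sketch of the second-confidence computation above: average, over the four
# boundaries, of the first probability value of each boundary's selected
# target distance range (here selected by argmax, i.e. the first variant).

def second_confidence(first_probs_per_boundary) -> float:
    picked = [float(np.max(p)) for p in first_probs_per_boundary]
    return float(np.mean(picked))

# Illustrative per-range probabilities for left, top, right, bottom:
boundaries = [np.array([0.1, 0.7, 0.2]),
              np.array([0.6, 0.3, 0.1]),
              np.array([0.2, 0.2, 0.6]),
              np.array([0.5, 0.4, 0.1])]
print(second_confidence(boundaries))  # -> 0.6
```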
- determining the object type information of the object to which each feature point in the image feature map belongs can be achieved by the following steps: based on the classification feature map, determine the second probability value that the object to which each feature point in the image feature map belongs is of each preset object type; and determine the object type information of the object to which the feature point belongs based on the preset object type corresponding to the largest second probability value.
- a convolutional neural network or a convolutional layer may be used to perform image feature extraction on the classification feature map to obtain the second probability value that the object to which the feature point belongs is of each preset object type. The preset object type corresponding to the largest second probability value is then selected to determine the object type information of the object to which the feature point belongs. As shown in FIG. 2, the second probability value corresponding to the preset object type "cat" is the largest in this embodiment, so the object type information is determined to correspond to a cat. It should be noted that, in this document, different operations may use different parts of the same convolutional neural network.
- the preset object type corresponding to the largest second probability value is selected as the object type information of the object to which the feature point belongs, which improves the accuracy of the determined object type information.
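- A brief sketch of this selection; the class names are illustrative, and taking the maximum probability as the first confidence is an assumption:

```python
import numpy as np

# Sketch of the classification step above: pick the preset object type with
# the largest second probability value.

PRESET_TYPES = ["cat", "dog", "person", "car"]

def classify(second_probs: np.ndarray):
    k = int(np.argmax(second_probs))
    return PRESET_TYPES[k], float(second_probs[k])

print(classify(np.array([0.8, 0.1, 0.05, 0.05])))  # -> ('cat', 0.8)
```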
- determining the positioning information of the object in the target image based on the object frame information of the object to which each feature point belongs and the target confidence of the object frame information can be implemented through steps S510 to S530.
- the multiple target feature points obtained by screening are feature points belonging to the same object.
- S520 From the object frame information of the object to which each target feature point belongs, select the object frame information with the highest target confidence as the target frame information.
- the object frame information corresponding to the highest target confidence can be selected to locate the object, and other object frame information with lower target confidence can be eliminated to reduce the amount of calculation in the object positioning process.
- S530 Determine positioning information of the object in the target image based on the selected target frame information and the target confidence of the target frame information.
- selecting the object frame information with the highest target confidence, from among the object frame information corresponding to closely spaced feature points with the same object type information, to position the object can effectively reduce the amount of object frame information used for object positioning, which is conducive to improving the timeliness of object positioning.
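- A minimal sketch of this NMS-like selection over steps S510 to S530; the array layouts and the Euclidean distance metric are assumptions:

```python
import numpy as np

# Sketch of steps S510-S530: among feature points that are closer than a
# preset threshold and share the same object type, keep only the object
# frame of the point with the highest target confidence.

def select_boxes(points, boxes, labels, confs, dist_thresh):
    """points: (N, 2) feature-point coords; boxes: (N, 4); labels: (N,);
    confs: (N,) target confidences. Returns the kept (box, conf, label)."""
    order = np.argsort(-confs)                 # highest confidence first
    suppressed = np.zeros(len(confs), dtype=bool)
    kept = []
    for i in order:
        if suppressed[i]:
            continue
        kept.append((boxes[i], float(confs[i]), labels[i]))
        d = np.linalg.norm(points - points[i], axis=1)
        suppressed |= (d < dist_thresh) & (labels == labels[i])
    return kept
```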
- the embodiments of the present disclosure also provide a positioning device, applied to a terminal device to locate an object in an image; the device and its modules can perform the same method steps as the positioning method described above and achieve the same or similar beneficial effects, so the repeated parts are not described again.
- the positioning device provided by the present disclosure includes:
- the image acquisition module 610 is configured to acquire a target image, where the target image includes at least one object to be located.
- the image processing module 620 is configured to determine, based on the image feature map of the target image, the object type information of the object to which each feature point belongs, the object frame information of the object to which each feature point belongs, and the object type in the image feature map The first confidence level of the information and the second confidence level of the object frame information.
- the confidence processing module 630 is configured to determine the target confidence of the object frame information of the object to which each feature point belongs based on the first confidence and the second confidence.
- the positioning module 640 is configured to determine the positioning information of the object in the target image based on the object frame information of the object to which each feature point belongs and the target confidence of the object frame information.
- the image feature map includes a classification feature map for classifying the objects to which the feature points in the image feature map belong, and a positioning feature map for positioning the objects to which the feature points in the image feature map belong.
- the image processing module 620 is configured to: for each feature point in the image feature map, determine, based on the classification feature map, the object type information of the object to which the feature point belongs and the first confidence of the object type information; and determine, based on the positioning feature map, the object frame information of the object to which the feature point belongs and the second confidence of the object frame information.
- when determining, based on the positioning feature map, the object frame information of the object to which each feature point in the image feature map belongs, the image processing module 620 is configured to: for each feature point in the image feature map, determine, based on the positioning feature map, the target distance range in which the distance between the feature point and each boundary of the object frame of the object to which the feature point belongs is located; determine, based on the target distance range and the positioning feature map, the target distance between the feature point and each boundary; and determine the object frame information of the object to which the feature point belongs based on the position information of the feature point in the image feature map and the target distance between the feature point and each boundary.
- when determining the target distance range in which the distance between a feature point and each boundary of the object frame of the object to which the feature point belongs is located, the image processing module 620 is configured to: for each boundary, determine the maximum distance between the feature point and the boundary based on the positioning feature map; segment the maximum distance to obtain multiple distance ranges; determine the first probability value that the distance falls within each distance range; and select the target distance range from the multiple distance ranges based on the determined first probability values.
- when selecting, based on the determined first probability values, the target distance range in which the distance between the feature point and the boundary is located from the multiple distance ranges, the image processing module is configured to:
- take the distance range corresponding to the largest first probability value as the target distance range.
- alternatively, when selecting, based on the determined first probability values, the target distance range in which the distance between the feature point and the boundary is located from the multiple distance ranges, the image processing module 620 is configured to: determine, based on the positioning feature map, a distance uncertainty parameter value of the distance between the feature point and the boundary; determine, based on the distance uncertainty parameter value and each first probability value, the target probability value that the distance falls within each distance range; and take the distance range corresponding to the maximum target probability value as the target distance range.
- when determining the second confidence of the object frame information, the image processing module 620 is configured to: determine the second confidence of the object frame information of the object to which the feature point belongs based on the first probability values corresponding to the target distance ranges in which the distances between the feature point and the respective boundaries of the object frame are located.
- when determining the second confidence of the object frame information of the object to which the feature point belongs, the image processing module is configured to: obtain the mean of the first probability values corresponding to the target distance ranges in which the distances between the feature point and the respective boundaries of the object frame are located, and take the mean as the second confidence.
- when determining, based on the classification feature map, the object type information of the object to which each feature point in the image feature map belongs, the image processing module 620 is configured to: determine, based on the classification feature map, the second probability value that the object to which the feature point belongs is of each preset object type; and determine the object type information of the object to which the feature point belongs based on the preset object type corresponding to the largest second probability value.
- the positioning module 640 is used to:
- Multiple target feature points are filtered from the image feature map, wherein the distance between the multiple target feature points is less than a preset threshold, and the object type information of the object to which each target feature point belongs is the same;
- from the object frame information of the objects to which the target feature points belong, select the object frame information with the highest target confidence as target frame information; and determine the positioning information of the object in the target image based on the selected target frame information and the target confidence of the target frame information.
- the embodiment of the present disclosure discloses an electronic device, as shown in FIG. 7, comprising: a processor 701, a memory 702, and a bus 703.
- the memory 702 stores machine-readable instructions executable by the processor 701. When the device is running, the processor 701 and the memory 702 communicate through the bus 703.
- when the machine-readable instructions are executed by the processor 701, the steps of the positioning method described above are performed, so that the positioning information of the object in the target image is determined.
- the embodiment of the present disclosure also provides a computer program product corresponding to the method and device, which includes a computer-readable storage medium storing program code.
- the instructions included in the program code can be used to execute the method in the previous method embodiment. For implementation, refer to the method embodiment, which will not be repeated here.
- the modules described as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
- the functional units in the various embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
- the function is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a non-volatile computer readable storage medium executable by a processor.
- the technical solution of the present disclosure essentially or the part that contributes to the prior art or the part of the technical solution can be embodied in the form of a software product, and the computer software product is stored in a storage medium, including Several instructions are used to make a computer device (which may be a personal computer, a server, or a network device, etc.) execute all or part of the steps of the methods described in the various embodiments of the present disclosure.
- the aforementioned storage media include: USB flash drives, removable hard disks, ROM, RAM, magnetic disks, optical disks, and other media that can store program code.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Data Mining & Analysis (AREA)
- Medical Informatics (AREA)
- Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- General Health & Medical Sciences (AREA)
- Software Systems (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Probability & Statistics with Applications (AREA)
- Image Analysis (AREA)
Abstract
Description
Cross-reference to related applications
The present disclosure claims priority to the Chinese patent application filed on January 18, 2020, with application number 202010058788.7 and invention title "Positioning method and device, electronic equipment, and computer-readable storage medium", the entire disclosure of which is incorporated herein by reference.
The present disclosure relates to the fields of computer technology and image processing, and in particular to a positioning method and device, electronic equipment, and computer-readable storage media.
Object detection or object positioning is an important basic technology in computer vision, which can be applied to scenarios such as instance segmentation, object tracking, person recognition, and face recognition.
Object detection or object positioning usually relies on anchor frames. However, when many anchor frames are used and their expression ability is weak, object positioning suffers from defects such as a large amount of calculation and inaccurate positioning.
Summary of the invention
In view of this, the present disclosure provides at least one positioning method and device.
In a first aspect, the present disclosure provides a positioning method, including:
acquiring a target image, where the target image includes at least one object to be positioned;
determining, based on an image feature map of the target image, object type information of the object to which each feature point in the image feature map belongs, object frame information of the object to which each feature point belongs, a first confidence of the object type information, and a second confidence of the object frame information;
determining, based on the first confidence and the second confidence, a target confidence of the object frame information of the object to which each feature point belongs;
determining positioning information of the object in the target image based on the object frame information of the object to which each feature point belongs and the target confidence of the object frame information.
In the above implementation, the image feature map of the target image makes it possible to determine only one anchor frame for each feature point in the image feature map, namely the object frame corresponding to the object frame information, which reduces the number of anchor frames used in the object positioning process, reduces the amount of calculation, and improves the efficiency of object positioning. At the same time, the image feature map of the target image can also be used to determine the object type information of the object to which each feature point belongs, the confidence of the object frame information, and the confidence of the object type information; the final confidence of the object frame information is then determined from these two confidences. This effectively enhances the information expression ability of the object frame or the object frame information: it expresses not only the positioning information and object type information of the object frame corresponding to the object frame information, but also the confidence of the object frame information, thereby helping to improve the accuracy of object positioning based on the object frame.
In a possible implementation, the image feature map includes a classification feature map used to classify the object to which the feature points in the image feature map belong, and a positioning feature map used to position the object to which the feature points in the image feature map belong.
Determining, based on the image feature map of the target image, the object type information of the object to which each feature point belongs, the object frame information of the object to which each feature point belongs, the first confidence of the object type information, and the second confidence of the object frame information includes:
for each feature point in the image feature map, determining, based on the classification feature map, the object type information of the object to which the feature point belongs and the first confidence of the object type information;
determining, based on the positioning feature map, the object frame information of the object to which the feature point belongs and the second confidence of the object frame information.
In the above implementation, based on the classification feature map and the positioning feature map of the target image, not only the object frame information of the object to which each feature point in the image feature map belongs is determined, but also the object type information of that object and the respective confidences of the object type information and the object frame information, which improves the information expression ability of the object frame and thereby helps to improve the accuracy of object positioning based on the object frame.
In a possible implementation, for each feature point in the image feature map, determining, based on the positioning feature map, the object frame information of the object to which the feature point belongs includes:
for each feature point in the image feature map, determining, based on the positioning feature map, the target distance range in which the distance between the feature point and each boundary of the object frame of the object to which the feature point belongs is located;
determining, based on the target distance range and the positioning feature map, the target distance between the feature point and each boundary of the object frame of the object to which the feature point belongs;
determining the object frame information of the object to which the feature point belongs based on the position information of the feature point in the image feature map and the target distance between the feature point and each boundary.
In the above implementation, the target distance range in which the distance between the feature point and each boundary of the object frame lies is determined first, and the target distance between the feature point and each boundary is then determined based on that range; this two-step processing improves the accuracy of the determined target distance. Based on the determined precise target distance, an accurately positioned object frame can then be determined for the feature point, which improves the accuracy of the determined object frame.
In a possible implementation, determining the target distance range in which the distance between the feature point and each boundary of the object frame of the object to which the feature point belongs is located includes:
for each boundary of the object frame of the object to which the feature point belongs, determining the maximum distance between the feature point and the boundary based on the positioning feature map;
segmenting the maximum distance to obtain multiple distance ranges;
determining, based on the positioning feature map, a first probability value that the distance between the feature point and the boundary falls within each distance range;
selecting, based on the determined first probability values, the target distance range in which the distance between the feature point and the boundary is located from the multiple distance ranges.
In the above implementation, the distance range corresponding to the maximum probability value can be selected as the target distance range in which the distance between the feature point and a given boundary lies, which improves the accuracy of the determined target distance range and thereby helps to improve the accuracy of the distance to that boundary determined based on the target distance range.
In a possible implementation, selecting, based on the determined first probability values, the target distance range in which the distance between the feature point and the boundary is located from the multiple distance ranges includes:
taking the distance range corresponding to the largest first probability value as the target distance range.
In a possible implementation, selecting, based on the determined first probability values, the target distance range in which the distance between the feature point and the boundary is located from the multiple distance ranges includes:
determining, based on the positioning feature map, a distance uncertainty parameter value of the distance between the feature point and the boundary;
determining, based on the distance uncertainty parameter value and each first probability value, a target probability value that the distance between the feature point and the boundary falls within each distance range;
taking the distance range corresponding to the maximum target probability value as the target distance range in which the distance between the feature point and the boundary is located.
In the above implementation, while the first probability value that the distance between the feature point and a boundary falls within each distance range is determined, an uncertainty parameter value is also determined, based on which the first probability values can be corrected to obtain the target probability values that the distance falls within each distance range. This improves the accuracy of the determined probability values and thereby helps to improve the accuracy of the target distance range determined based on them.
在一种可能的实施方式中,确定所述对象边框信息的第二置信度,包括:In a possible implementation manner, determining the second confidence level of the object frame information includes:
基于该特征点与该特征点所属对象的对象边框中每条边界的距离各自所位于的目标距离范围对应的第一概率值,确定该特征点所属对象的对象边框信息的第二置信度。Based on the first probability value corresponding to the target distance range where the distance between the feature point and each boundary in the object frame of the object to which the feature point belongs is determined, the second confidence level of the object frame information of the object to which the feature point belongs is determined.
In a possible implementation, determining the second confidence of the object frame information of the object to which the feature point belongs includes:
obtaining the mean of the first probability values corresponding to the target distance ranges in which the distances between the feature point and each boundary of the object frame of that object lie;
taking the mean as the second confidence.
In the above implementation, the confidence of the object frame information of the object to which a feature point belongs can be determined from the first probability values corresponding to the distance ranges in which the distances between the feature point and each boundary lie, which enhances the information expression capability of the object frame.
In a possible implementation, determining, for each feature point in the image feature map and based on the classification feature map, the object type information of the object to which the feature point belongs includes:
determining, for each feature point in the image feature map and based on the classification feature map, a second probability value that the object to which the feature point belongs is of each preset object type;
determining the object type information of the object to which the feature point belongs based on the preset object type corresponding to the largest second probability value.
In the above implementation, the preset object type corresponding to the largest second probability value is selected as the object type information of the object to which the feature point belongs, which improves the accuracy of the determined object type information.
In a possible implementation, determining the positioning information of the objects in the target image based on the object frame information of the object to which each feature point belongs and the target confidence of that object frame information includes:
filtering out a plurality of target feature points from the image feature map, where the distances between the target feature points are less than a preset threshold and the objects to which the target feature points belong share the same object type information;
selecting, from the object frame information of the objects to which the target feature points belong, the object frame information with the highest target confidence as target frame information;
determining the positioning information of the object in the target image based on the selected target frame information and the target confidence of the target frame information.
In the above implementation, object positioning is performed using the object frame information with the highest target confidence, selected from among feature points that are close to one another and share the same object type information. This effectively reduces the amount of object frame information used for object positioning and helps improve its timeliness.
In a second aspect, the present disclosure provides a positioning apparatus, including:
an image acquisition module, configured to acquire a target image, where the target image includes at least one object to be located;
an image processing module, configured to determine, based on an image feature map of the target image, object type information of the object to which each feature point in the image feature map belongs, object frame information of the object to which each feature point belongs, a first confidence of the object type information, and a second confidence of the object frame information;
a confidence processing module, configured to determine, based on the first confidence and the second confidence, a target confidence of the object frame information of the object to which each feature point belongs;
a positioning module, configured to determine positioning information of the objects in the target image based on the object frame information of the object to which each feature point belongs and the target confidence of that object frame information.
In a possible implementation, the image feature map includes a classification feature map used to classify the objects to which the feature points in the image feature map belong, and a positioning feature map used to locate the objects to which the feature points in the image feature map belong.
The image processing module is configured to:
determine, for each feature point in the image feature map and based on the classification feature map, the object type information of the object to which the feature point belongs and the first confidence of that object type information;
determine, based on the positioning feature map, the object frame information of the object to which the feature point belongs and the second confidence of that object frame information.
In a possible implementation, when determining, for each feature point in the image feature map and based on the positioning feature map, the object frame information of the object to which the feature point belongs, the image processing module is configured to:
determine, for each feature point in the image feature map and based on the positioning feature map, the target distance range in which the distance between the feature point and each boundary of the object frame of the object to which the feature point belongs lies;
determine, based on the target distance ranges and the positioning feature map, the target distance between the feature point and each boundary of the object frame of the object to which the feature point belongs;
determine the object frame information of the object to which the feature point belongs, based on the position information of the feature point in the image feature map and the target distance between the feature point and each boundary.
In a possible implementation, when determining the target distance ranges in which the distances between the feature point and each boundary of the object frame of the object to which the feature point belongs lie, the image processing module is configured to:
for each boundary of the object frame of the object to which the feature point belongs, determine, based on the positioning feature map, the maximum distance between the feature point and the boundary;
segment the maximum distance to obtain a plurality of distance ranges;
determine, based on the positioning feature map, a first probability value that the distance between the feature point and the boundary lies within each distance range;
select, based on the determined first probability values, the target distance range in which the distance between the feature point and the boundary lies from the plurality of distance ranges.
In a possible implementation, when selecting, based on the determined first probability values, the target distance range in which the distance between the feature point and the boundary lies from the plurality of distance ranges, the image processing module is configured to:
take the distance range corresponding to the largest first probability value as the target distance range.
In a possible implementation, when selecting, based on the determined first probability values, the target distance range in which the distance between the feature point and the boundary lies from the plurality of distance ranges, the image processing module is configured to:
determine, based on the positioning feature map, a distance uncertainty parameter value for the distance between the feature point and the boundary;
determine, based on the distance uncertainty parameter value and each first probability value, a target probability value that the distance between the feature point and the boundary lies within each distance range;
take the distance range corresponding to the largest target probability value as the target distance range in which the distance between the feature point and the boundary lies.
In a possible implementation, when determining the second confidence of the object frame information, the image processing module is configured to:
determine the second confidence of the object frame information of the object to which the feature point belongs, based on the first probability values corresponding to the target distance ranges in which the distances between the feature point and each boundary of the object frame of that object lie.
In a possible implementation, when determining the second confidence of the object frame information of the object to which the feature point belongs, the image processing module is configured to:
obtain the mean of the first probability values corresponding to the target distance ranges in which the distances between the feature point and each boundary of the object frame of that object lie;
take the mean as the second confidence.
In a possible implementation, when determining, for each feature point in the image feature map and based on the classification feature map, the object type information of the object to which the feature point belongs, the image processing module is configured to:
determine, for each feature point in the image feature map and based on the classification feature map, a second probability value that the object to which the feature point belongs is of each preset object type;
determine the object type information of the object to which the feature point belongs based on the preset object type corresponding to the largest second probability value.
In a possible implementation, the positioning module is configured to:
filter out a plurality of target feature points from the image feature map, where the distances between the target feature points are less than a preset threshold and the objects to which the target feature points belong share the same object type information;
select, from the object frame information of the objects to which the target feature points belong, the object frame information with the highest target confidence as target frame information;
determine the positioning information of the object in the target image based on the selected target frame information and the target confidence of the target frame information.
In a third aspect, the present disclosure provides an electronic device including a processor, a memory, and a bus. The memory stores machine-readable instructions executable by the processor. When the electronic device runs, the processor and the memory communicate via the bus, and when the machine-readable instructions are executed by the processor, the steps of the positioning method described above are performed.
In a fourth aspect, the present disclosure further provides a computer-readable storage medium on which a computer program is stored; when the computer program is run by a processor, the steps of the positioning method described above are performed.
The apparatus, electronic device, and computer-readable storage medium of the present disclosure contain technical features at least substantially the same as, or similar to, those of any aspect of the method of the present disclosure or any implementation of any aspect thereof. For descriptions of their effects, reference may be made to the descriptions of the effects of the method, which are not repeated here.
To explain the technical solutions of the embodiments of the present disclosure more clearly, the drawings needed in the embodiments are briefly introduced below. It should be understood that the following drawings show only certain embodiments of the present disclosure and should therefore not be regarded as limiting its scope. For those of ordinary skill in the art, other related drawings can be obtained from these drawings without creative effort.
FIG. 1 shows a flowchart of a positioning method provided by an embodiment of the present disclosure;
FIG. 2 shows a flowchart of a positioning method provided by an embodiment of the present disclosure;
FIG. 3 shows a flowchart of a positioning method provided by an embodiment of the present disclosure;
FIG. 4 shows a flowchart of a positioning method provided by an embodiment of the present disclosure;
FIG. 5 shows a flowchart of a positioning method provided by an embodiment of the present disclosure;
FIG. 6 shows a schematic structural diagram of a positioning apparatus provided by an embodiment of the present disclosure;
FIG. 7 shows a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
To make the purpose and advantages of the embodiments of the present disclosure clearer, the embodiments are described below in conjunction with the accompanying drawings. It should be understood that the drawings serve only the purposes of illustration and description and are not used to limit the protection scope of the present disclosure; it should also be understood that the schematic drawings are not drawn to scale. The flowcharts used in the present disclosure show operations implemented according to some embodiments of the present disclosure. It should be understood that the operations of a flowchart may be implemented out of order, and steps without a logical contextual relationship may be reversed in order or performed at the same time. In addition, under the guidance of the present disclosure, those skilled in the art may add one or more other operations to a flowchart or remove one or more operations from it.
In addition, the described embodiments are only some, not all, of the embodiments of the present disclosure. The components of the embodiments of the present disclosure, as generally described and illustrated in the drawings herein, may be arranged and designed in a variety of different configurations. Therefore, the following detailed description of the embodiments of the present disclosure provided with reference to the drawings is not intended to limit the claimed scope of the present disclosure, but merely represents selected embodiments of it. All other embodiments obtained by those skilled in the art based on the embodiments of the present disclosure without creative effort fall within the protection scope of the present disclosure.
It should be noted that the term "including" is used in the embodiments of the present disclosure to indicate the presence of the features stated thereafter, without excluding the addition of other features.
To reduce the number of anchor frames used for positioning and improve their information expression capability during anchor-frame-based object positioning, and thereby improve the accuracy of object positioning, the present disclosure provides a positioning method and apparatus, an electronic device, and a computer-readable storage medium. Based on the image feature map of a target image, the present disclosure determines only one anchor frame for each feature point in the image feature map, namely the object frame corresponding to the object frame information, which reduces the number of anchor frames used during object positioning and lowers the amount of computation. At the same time, based on the image feature map of the target image, the object type information of the object to which each feature point belongs, the confidence of the object frame information, and the confidence of the object type information can also be determined, and the final confidence of the object frame information is then determined from the two confidences. This effectively enhances the information expression capability of the object frame and helps improve the accuracy of object positioning based on object frames.
The positioning method and apparatus, electronic device, and computer-readable storage medium of the present disclosure are described below through specific embodiments.
An embodiment of the present disclosure provides a positioning method applied to a terminal device that locates objects in images. The terminal device may be a camera, a mobile phone, a wearable device, a personal computer, or the like, which is not limited in the embodiments of the present disclosure. Specifically, as shown in FIG. 1, the positioning method provided by this embodiment includes steps S110 to S140.
S110: Acquire a target image.
Here, the target image may be an image including a target object captured during object tracking, or an image including a human face captured during face detection; the present disclosure does not limit the purpose of the target image.
The target image includes at least one object to be located. The objects here may be physical objects, people, animals, and so on.
The target image may be captured by the terminal device that performs the positioning method of this embodiment, or it may be captured by another device and then transmitted to that terminal device; the present disclosure does not limit how the target image is obtained.
S120: Based on an image feature map of the target image, determine object type information of the object to which each feature point in the image feature map belongs, object frame information of the object to which each feature point belongs, a first confidence of the object type information, and a second confidence of the object frame information.
Before this step is performed, the target image first needs to be processed to obtain its image feature map. In a specific implementation, a convolutional neural network may be used to extract image features from the target image to obtain the image feature map.
After the image feature map of the target image is determined, the image feature map is processed. In this way, for each feature point in the image feature map, the object type information of the object to which the feature point belongs, the object frame information of that object, the first confidence of the object type information, and the second confidence of the object frame information can be determined. In a specific implementation, a convolutional neural network may be used to perform further feature extraction on the image feature map to obtain the object type information, the object frame information, the first confidence, and the second confidence.
The object type information includes the object category of the object to which the feature point belongs. The object frame information includes the distances between the feature point and each boundary of the object frame corresponding to that object frame information. The object frame may also be referred to as an anchor frame.
The first confidence characterizes the accuracy or credibility of the object type information determined from the image feature map. The second confidence characterizes the accuracy or credibility of the object frame information determined from the image feature map.
S130: Based on the first confidence and the second confidence, determine a target confidence of the object frame information of the object to which each feature point belongs.
Here, the product of the first confidence and the second confidence may be used as the target confidence of the object frame information. The target confidence comprehensively characterizes the positioning accuracy and the classification accuracy of the object frame corresponding to the object frame information.
Of course, other methods may also be used to determine the target confidence; for example, it may be determined by combining a preset weight of the first confidence, a preset weight of the second confidence, the first confidence, and the second confidence. The present disclosure does not limit the specific scheme for determining the target confidence from the first and second confidences. A minimal sketch of this step follows.
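The sketch below (in Python with NumPy; the function name and the exponent-weight form of the weighted variant are illustrative assumptions, not part of the disclosure) shows one way to combine the two confidences per feature point:

```python
import numpy as np

def target_confidence(first_conf: np.ndarray, second_conf: np.ndarray,
                      w1: float = 1.0, w2: float = 1.0) -> np.ndarray:
    """Combine the classification (first) and box (second) confidences.

    With w1 == w2 == 1.0 this is the plain product described above;
    other weights give one possible weighted variant."""
    return (first_conf ** w1) * (second_conf ** w2)
```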
S140: Determine positioning information of the objects in the target image based on the object frame information of the object to which each feature point belongs and the target confidence of that object frame information.
Here, the object frame information of the object to which a feature point belongs and the target confidence of that object frame information may be used as the positioning information, in the target image, of the object to which the feature point belongs. Then, based on the positioning information of the object to which each feature point belongs, the positioning information of each object in the target image is determined.
Here, not only the object frame information of the object to which each feature point belongs is determined, but also the target confidence of that object frame information. This effectively enhances the information expression capability of the object frame: it can express not only the positioning information and the object type information of the corresponding object frame, but also the confidence information of the object frame information, which helps improve the accuracy of object positioning based on object frames.
In addition, based on the image feature map of the target image, the above embodiment can determine one anchor frame for each feature point in the image feature map, namely the object frame corresponding to the object frame information, which reduces the number of anchor frames used during object positioning, lowers the amount of computation, and improves the efficiency of object positioning.
In some examples, as shown in FIG. 2, the image feature map includes a classification feature map used to classify the objects to which the feature points in the image feature map belong, and a positioning feature map used to locate those objects.
In a specific implementation, as shown in FIG. 2, a convolutional neural network may be used to extract image features from the target image to obtain an initial feature map, after which two stacks of four 3×3 convolutional layers, each with 256 input and output channels, process the initial feature map to obtain the classification feature map and the positioning feature map, respectively; a sketch of such a two-branch head is given below.
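The following is a minimal sketch of the two branches, assuming a PyTorch implementation (the module and function names, and the ReLU activations between layers, are illustrative assumptions; the disclosure only specifies four 3×3 convolutional layers with 256 input and output channels per branch):

```python
import torch
import torch.nn as nn

def make_tower(channels: int = 256, depth: int = 4) -> nn.Sequential:
    """Four 3x3 conv layers with 256 input/output channels, as described."""
    layers = []
    for _ in range(depth):
        layers.append(nn.Conv2d(channels, channels, kernel_size=3, padding=1))
        layers.append(nn.ReLU(inplace=True))  # activation choice is an assumption
    return nn.Sequential(*layers)

class DualBranchHead(nn.Module):
    """Turns the initial feature map into the classification and
    positioning feature maps via two parallel towers."""
    def __init__(self, channels: int = 256):
        super().__init__()
        self.cls_tower = make_tower(channels)
        self.loc_tower = make_tower(channels)

    def forward(self, feat: torch.Tensor):
        return self.cls_tower(feat), self.loc_tower(feat)
```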
After the classification feature map and the positioning feature map are obtained, determining, based on the image feature map of the target image, the object type information of the object to which each feature point in the image feature map belongs, the object frame information of the object to which each feature point belongs, the first confidence of the object type information, and the second confidence of the object frame information may be implemented using the following steps:
based on the classification feature map, determine the object type information of the object to which each feature point in the image feature map belongs and the first confidence of that object type information; based on the positioning feature map, determine the object frame information of the object to which each feature point belongs and the second confidence of that object frame information.
In a specific implementation, a convolutional neural network or a convolutional layer may be used to extract features from the classification feature map to obtain the object type information of the object to which each feature point belongs and the first confidence of that information, and to extract features from the positioning feature map to obtain the object frame information of the object to which each feature point belongs and the second confidence of that information.
In the above embodiment, based on the classification feature map and the positioning feature map of the target image, not only the object frame information of the object to which each feature point belongs is determined, but also the object type information of that object, along with the confidences corresponding to the object type information and the object frame information respectively. This improves the information expression capability of the object frame and thereby helps improve the accuracy of object positioning based on object frames.
In some embodiments, as shown in FIG. 3, determining, based on the positioning feature map, the object frame information of the object to which each feature point in the image feature map belongs may be implemented using steps S310 to S330.
S310: For each feature point in the image feature map, determine, based on the positioning feature map, the target distance range in which the distance between the feature point and each boundary of the object frame of the object to which the feature point belongs lies. The boundaries of the object frame may be its boundaries in the various directions, for example the upper, lower, left, and right boundaries.
Here, a convolutional neural network or a convolutional layer may be used to extract features from the positioning feature map to determine the target distance range in which the distance between a feature point and each boundary of the object frame of its object lies.
In a specific implementation, the maximum distance between the feature point and a given boundary may first be determined based on the positioning feature map; the maximum distance is then segmented to obtain a plurality of distance ranges; a convolutional neural network or a convolutional layer is used to extract features from the positioning feature map to determine a first probability value that the distance between the feature point and the boundary lies within each distance range; finally, based on the determined first probability values, the target distance range in which the distance between the feature point and the boundary lies is selected from the plurality of distance ranges. Specifically, the distance range corresponding to the largest first probability value may be used as the target distance range.
As shown in FIG. 2, the object frame may include, for example, an upper boundary, a lower boundary, a left boundary, and a right boundary. Using this method, five first probability values a, b, c, d, e are determined for the five distance ranges corresponding to the left boundary, and the distance range corresponding to the largest first probability value, b, is selected as the target distance range.
As described above, selecting the distance range corresponding to the maximum probability value as the target distance range in which the distance between the feature point and the boundary lies improves the accuracy of the determined target distance range, and thereby helps improve the accuracy of the distance between the feature point and the boundary determined on the basis of that range; a sketch of the bin selection follows.
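The sketch below (Python/NumPy) illustrates the selection; the equal-width segmentation of the maximum distance and the example probability values are assumptions made for illustration, since the disclosure does not fix how the segmentation is performed:

```python
import numpy as np

def select_target_range(max_dist: float, first_probs: np.ndarray):
    """Split [0, max_dist] into len(first_probs) distance ranges and
    return the range whose first probability value is largest."""
    n = len(first_probs)
    edges = np.linspace(0.0, max_dist, n + 1)   # range boundaries
    k = int(np.argmax(first_probs))             # index of the target range
    return edges[k], edges[k + 1]

# With five probabilities a..e as in FIG. 2, e.g. [0.1, 0.4, 0.2, 0.2, 0.1],
# the second range (the one whose probability is b = 0.4) is returned.
lo, hi = select_target_range(10.0, np.array([0.1, 0.4, 0.2, 0.2, 0.1]))
```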
S320: Based on the target distance ranges and the positioning feature map, determine the target distance between the feature point and each boundary of the object frame of the object to which the feature point belongs.
After the target distance range is determined, a regression network matching that target distance range, for example a convolutional neural network, is selected to extract features from the positioning feature map so as to obtain the target distance between the feature point and each boundary of the object frame of the object to which the feature point belongs.
Here, on the basis of the determined target distance range, a convolutional neural network is further used to determine a precise distance, which can effectively improve the accuracy of the determined distance.
In addition, as shown in FIG. 2, after the target distance is determined, a preset or trained parameter or weight N may be used to correct it to obtain the final target distance.
As shown in FIG. 2, this step determines the precise target distance between the feature point and the left boundary; that target distance is marked in FIG. 2 and denoted by f. As shown in FIG. 2, the determined target distance lies within the determined target distance range.
S330: Determine the object frame information of the object to which the feature point belongs, based on the position information of the feature point in the image feature map and the target distance between the feature point and each boundary.
Here, the position information of the feature point in the image feature map and the target distance between the feature point and each boundary can be used to determine the position information, in the image feature map, of each boundary of the object frame corresponding to the object frame information. Finally, the position information of all boundaries of the object frame in the image feature map may be used as the object frame information of the object to which the feature point belongs.
In the above embodiment, the target distance range in which the distance between the feature point and each boundary of the object frame lies is determined first; then, based on the determined target distance range, the target distance between the feature point and each boundary is determined. This two-step processing can improve the accuracy of the determined target distance. Afterwards, based on the determined precise target distance, an accurately positioned object frame can be determined for the feature point, which improves the accuracy of the determined object frame; the decoding in step S330 is sketched below.
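Step S330 amounts to the following decoding, sketched in Python/NumPy under the assumptions that the four target distances are given in left/top/right/bottom order and that the feature-point coordinates and the distances share the same units:

```python
import numpy as np

def decode_object_frame(px: float, py: float,
                        left: float, top: float,
                        right: float, bottom: float) -> np.ndarray:
    """Object frame (x1, y1, x2, y2) from the feature point position
    (px, py) and its refined target distances to the four boundaries."""
    return np.array([px - left, py - top, px + right, py + bottom])
```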
In some embodiments, as shown in FIG. 4, selecting, based on the determined first probability values, the target distance range in which the distance between a feature point and a given boundary lies from the plurality of distance ranges may also be implemented using steps S410 to S430.
S410: Based on the positioning feature map, determine a distance uncertainty parameter value for the distance between the feature point and the boundary.
Here, a convolutional neural network may be used to determine the distance uncertainty parameter value for the distance between the feature point and the boundary at the same time as the first probability values that the distance lies within each distance range are determined. The distance uncertainty parameter value can be used to characterize the credibility of the determined first probabilities.
S420: Based on the distance uncertainty parameter value and each first probability value, determine a target probability value that the distance between the feature point and the boundary lies within each distance range.
Here, each first probability value is corrected using the distance uncertainty parameter value to obtain the corresponding target probability value.
In a specific implementation, the target probability values can be determined using the following formula:

$$p_{x,n} = \frac{\exp\left(s_{x,n}/\sigma_x\right)}{\sum_{m=1}^{N}\exp\left(s_{x,m}/\sigma_x\right)}$$

where $p_{x,n}$ denotes the target probability value that the distance between the feature point and boundary $x$ lies within the $n$-th distance range, $N$ denotes the number of distance ranges, $\sigma_x$ denotes the distance uncertainty parameter value corresponding to boundary $x$, $s_{x,n}$ denotes the first probability value that the distance between the feature point and boundary $x$ lies within the $n$-th distance range, and $s_{x,m}$ denotes the first probability value that the distance between the feature point and boundary $x$ lies within the $m$-th distance range.
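A sketch of the correction in step S420, assuming the uncertainty-scaled softmax form given above (Python/NumPy; the epsilon guard against a zero uncertainty value is an implementation assumption):

```python
import numpy as np

def target_probabilities(s: np.ndarray, sigma: float) -> np.ndarray:
    """Uncertainty-scaled softmax over the N first probability values s
    predicted for one boundary; sigma is the distance uncertainty
    parameter value.  A small sigma sharpens the distribution around the
    best-scoring range, a large sigma flattens it."""
    z = s / max(sigma, 1e-8)
    z = z - z.max()          # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()
```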
S430: Based on the determined target probability values, select, from the plurality of distance ranges, the target distance range in which the distance between the feature point and the boundary lies.
Here, specifically, the distance range corresponding to the largest target probability value may be selected as the target distance range.
In the above embodiment, while the first probability values that the distance between the feature point and a given boundary lies within each distance range are determined, a distance uncertainty parameter value is also determined. Based on this parameter value, the first probabilities can be corrected to obtain the target probability values that the distance lies within each distance range. This improves the accuracy of the probability values that the distance between the feature point and the boundary lies within each distance range, and thereby helps improve the accuracy of the target distance range determined from them.
After the target distance between a feature point and each boundary of the corresponding object frame is determined, the confidence of the corresponding object frame information, namely the second confidence, can be determined using the following step: determining the second confidence of the object frame information of the object to which the feature point belongs, based on the first probability values corresponding to the target distance ranges in which the distances between the feature point and each boundary of the object frame of that object lie.
In a specific implementation, the mean of the first probability values corresponding to the target distance ranges in which the distances between the feature point and all boundaries of the object frame of its object lie may be used as the second confidence.
Of course, other methods may also be used to determine the second confidence; the present disclosure does not limit the method of determining the second confidence from the first probability values corresponding to the target distance ranges.
In the above implementation, the first probability values corresponding to the distance ranges in which the distances between the feature point and each boundary lie can be used to determine the confidence of the object frame information of the object to which the feature point belongs, namely the second confidence, which enhances the information expression capability of the object frame; a sketch of this computation follows.
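A minimal sketch of the mean-based second confidence (Python/NumPy; the (4, N) layout, one row per boundary, is an assumption made for illustration):

```python
import numpy as np

def second_confidence(first_probs: np.ndarray) -> float:
    """first_probs: shape (4, N), the first probability values of one
    feature point over the N distance ranges of its four boundaries.
    Each boundary contributes the probability of its selected target
    range (the row maximum); their mean is the second confidence."""
    return float(first_probs.max(axis=1).mean())
```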
In some embodiments, determining, based on the classification feature map, the object type information of the object to which each feature point in the image feature map belongs may be implemented using the following steps: based on the classification feature map, determine a second probability value that the object to which each feature point in the image feature map belongs is of each preset object type; determine the object type information of the object to which the feature point belongs based on the preset object type corresponding to the largest second probability value.
In a specific implementation, a convolutional neural network or a convolutional layer may be used to extract features from the classification feature map to obtain, for each feature point, the second probability value that the object to which the feature point belongs is of each preset object type. The preset object type corresponding to the largest second probability value is then selected to determine the object type information of the object to which the feature point belongs. As shown in FIG. 2, the second probability value corresponding to the preset object type "cat" determined in this embodiment is the largest, so the object type information is determined to correspond to a cat. It should be noted that, herein, different operations may use different parts of the same convolutional neural network.
In the above implementation, the preset object type corresponding to the largest second probability value is selected as the object type information of the object to which the feature point belongs, which improves the accuracy of the determined object type information; the selection is sketched below.
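The selection can be sketched as follows (Python/NumPy; the class names and probability values are illustrative, and treating the largest second probability value itself as the first confidence is an assumption here, since the disclosure only requires that the first confidence be derived from the classification feature map):

```python
import numpy as np

def classify_point(class_probs: np.ndarray, class_names):
    """Pick the preset object type with the largest second probability
    value for one feature point."""
    k = int(np.argmax(class_probs))
    return class_names[k], float(class_probs[k])  # (type info, confidence)

# Mirroring FIG. 2: classify_point(np.array([0.7, 0.2, 0.1]),
#                                  ["cat", "dog", "bird"]) -> ("cat", 0.7)
```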
In some embodiments, as shown in FIG. 5, determining the positioning information of the objects in the target image based on the object frame information of the object to which each feature point belongs and the target confidence of that object frame information may be implemented using steps S510 to S530.
S510: Filter out a plurality of target feature points from the image feature map, where the distances between the target feature points are less than a preset threshold and the objects to which the target feature points belong share the same object type information.
Here, the target feature points obtained by filtering are feature points that belong to the same object.
S520: From the object frame information of the objects to which the target feature points belong, select the object frame information with the highest target confidence as target frame information.
For feature points belonging to the same object, the object frame information corresponding to the highest target confidence can be selected to locate the object, and the remaining object frame information with lower target confidence can be discarded to reduce the amount of computation during object positioning.
S530: Determine the positioning information of the object in the target image based on the selected target frame information and the target confidence of the target frame information.
In the above implementation, object positioning is performed using the object frame information with the highest target confidence, selected from the object frame information corresponding to feature points that are close to one another and share the same object type information. This effectively reduces the amount of object frame information used for object positioning and helps improve its timeliness; a sketch of steps S510 to S530 follows.
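Taken together, steps S510 to S530 resemble a non-maximum-suppression pass over feature points. The sketch below (Python/NumPy) makes two assumptions not fixed by the disclosure: a Euclidean distance test between feature-point coordinates, and a greedy highest-confidence-first grouping:

```python
import numpy as np

def select_positioning_info(points, boxes, types, target_confs, thresh):
    """Keep, in each group of mutually close feature points that share the
    same object type information, only the object frame information with
    the highest target confidence (steps S510-S530)."""
    order = list(np.argsort(-np.asarray(target_confs)))  # best first
    suppressed, results = set(), []
    for i in order:
        if i in suppressed:
            continue
        results.append((boxes[i], target_confs[i], types[i]))
        for j in order:
            if (j != i and j not in suppressed
                    and types[j] == types[i]
                    and np.linalg.norm(np.asarray(points[i])
                                       - np.asarray(points[j])) < thresh):
                suppressed.add(j)  # lower-confidence box of the same object
    return results
```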
Corresponding to the positioning method, an embodiment of the present disclosure further provides a positioning apparatus, which is applied to a terminal device that locates objects in images. The apparatus and its modules can perform the same method steps as the positioning method and achieve the same or similar beneficial effects, so the repeated parts are not described again.
As shown in FIG. 6, the positioning apparatus provided by the present disclosure includes:
an image acquisition module 610, configured to acquire a target image, where the target image includes at least one object to be located;
an image processing module 620, configured to determine, based on an image feature map of the target image, object type information of the object to which each feature point in the image feature map belongs, object frame information of the object to which each feature point belongs, a first confidence of the object type information, and a second confidence of the object frame information;
a confidence processing module 630, configured to determine, based on the first confidence and the second confidence, a target confidence of the object frame information of the object to which each feature point belongs;
a positioning module 640, configured to determine positioning information of the objects in the target image based on the object frame information of the object to which each feature point belongs and the target confidence of that object frame information.
In some embodiments, the image feature map includes a classification feature map used to classify the objects to which the feature points in the image feature map belong, and a positioning feature map used to locate those objects.
The image processing module 620 is configured to:
determine, based on the classification feature map, the object type information of the object to which each feature point in the image feature map belongs and the first confidence of that object type information;
determine, based on the positioning feature map, the object frame information of the object to which each feature point in the image feature map belongs and the second confidence of that object frame information.
In some embodiments, when determining, based on the positioning feature map, the object frame information of the object to which each feature point in the image feature map belongs, the image processing module 620 is configured to:
determine, for each feature point in the image feature map and based on the positioning feature map, the target distance range in which the distance between the feature point and each boundary of the object frame of the object to which the feature point belongs lies;
determine, based on the target distance ranges and the positioning feature map, the target distance between the feature point and each boundary of the object frame of the object to which the feature point belongs;
determine the object frame information of the object to which the feature point belongs, based on the position information of the feature point in the image feature map and the target distance between the feature point and each boundary.
In some embodiments, when determining the target distance ranges in which the distances between a feature point and each boundary of the object frame of the object to which the feature point belongs lie, the image processing module 620 is configured to:
for each boundary of the object frame of the object to which the feature point belongs, determine, based on the positioning feature map, the maximum distance between the feature point and the boundary;
segment the maximum distance to obtain a plurality of distance ranges;
determine, based on the positioning feature map, a first probability value that the distance between the feature point and the boundary lies within each distance range;
select, based on the determined first probability values, the target distance range in which the distance between the feature point and the boundary lies from the plurality of distance ranges.
In some embodiments, when selecting, based on the determined first probability values, the target distance range in which the distance between the feature point and the boundary lies from the plurality of distance ranges, the image processing module is configured to:
take the distance range corresponding to the largest first probability value as the target distance range.
In some embodiments, when selecting, based on the determined first probability values, the target distance range in which the distance between the feature point and the boundary lies from the plurality of distance ranges, the image processing module 620 is configured to:
determine, based on the positioning feature map, a distance uncertainty parameter value for the distance between the feature point and the boundary;
determine, based on the distance uncertainty parameter value and each first probability value, a target probability value that the distance between the feature point and the boundary lies within each distance range;
take the distance range corresponding to the largest target probability value as the target distance range in which the distance between the feature point and the boundary lies.
In some embodiments, when determining the second confidence of the object frame information, the image processing module 620 is configured to:
determine the second confidence of the object frame information of the object to which a feature point in the image feature map belongs, based on the first probability values corresponding to the target distance ranges in which the distances between the feature point and each boundary of the object frame of that object lie.
In some embodiments, when determining the second confidence of the object frame information of the object to which the feature point belongs, the image processing module is configured to:
obtain the mean of the first probability values corresponding to the target distance ranges in which the distances between the feature point and each boundary of the object frame of that object lie;
take the mean as the second confidence.
In some embodiments, when determining, based on the classification feature map, the object type information of the object to which each feature point in the image feature map belongs, the image processing module 620 is configured to:
determine, based on the classification feature map, a second probability value that the object to which each feature point in the image feature map belongs is of each preset object type;
determine the object type information of the object to which the feature point belongs based on the preset object type corresponding to the largest second probability value.
In some embodiments, the positioning module 640 is configured to:
filter out a plurality of target feature points from the image feature map, where the distances between the target feature points are less than a preset threshold and the objects to which the target feature points belong share the same object type information;
select, from the object frame information of the objects to which the target feature points belong, the object frame information with the highest target confidence as target frame information;
determine the positioning information of the object in the target image based on the selected target frame information and the target confidence of the target frame information.
An embodiment of the present disclosure discloses an electronic device. As shown in FIG. 7, the electronic device includes a processor 701, a memory 702, and a bus 703. The memory 702 stores machine-readable instructions executable by the processor 701. When the electronic device is running, the processor 701 and the memory 702 communicate via the bus 703.
When executed by the processor 701, the machine-readable instructions perform the following steps of the positioning method:
acquiring a target image, where the target image includes at least one object to be positioned;
determining, based on an image feature map of the target image, the object type information of the object to which each feature point in the image feature map belongs, the object frame information of the object to which each feature point belongs, a first confidence of the object type information, and a second confidence of the object frame information;
determining, based on the first confidence and the second confidence, a target confidence of the object frame information of the object to which each feature point belongs; and
determining the positioning information of the object in the target image based on the object frame information of the object to which each feature point belongs and the target confidence of that object frame information. A sketch of one possible confidence fusion follows.
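How the first and second confidences are fused into the target confidence is not restated at this point in the text; one plausible reading, shown here purely as an assumption, is the product of the two, which keeps the result in [0, 1] and penalizes a frame that is weak on either classification or localization:

```python
def target_confidence(first_conf: float, second_conf: float) -> float:
    """Assumed fusion rule: product of classification and frame confidences."""
    return first_conf * second_conf

print(target_confidence(0.7, 0.875))  # 0.6125
```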
In addition, when executed by the processor 701, the machine-readable instructions may also perform the method of any of the implementations described in the method section, which is not repeated here.
An embodiment of the present disclosure further provides a computer program product corresponding to the method and apparatus, including a computer-readable storage medium storing program code. The instructions included in the program code can be used to execute the method in the foregoing method embodiments; for specific implementation, refer to the method embodiments, which are not repeated here.
The above descriptions of the various embodiments tend to emphasize the differences between them; for what they have in common or in similarity, the embodiments may be referred to one another. For brevity, the details are not repeated herein.
Those skilled in the art can clearly understand that, for convenience and brevity of description, reference may be made to the corresponding processes in the method embodiments for the specific working processes of the systems and apparatuses described above, which are not repeated in the present disclosure. It should be understood that the systems, apparatuses, and methods disclosed in the several embodiments provided by the present disclosure may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into modules is only a division by logical function, and other divisions are possible in actual implementation; for instance, multiple modules or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some communication interfaces, apparatuses, or modules, and may be electrical, mechanical, or in other forms.
The modules described as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present disclosure may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit.
If implemented in the form of a software functional unit and sold or used as an independent product, the functions may be stored in a processor-executable non-volatile computer-readable storage medium. Based on this understanding, the technical solution of the present disclosure, in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of the present disclosure. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disc.
The above are only specific implementations of the present disclosure, but the protection scope of the present disclosure is not limited thereto. Any changes or substitutions that a person skilled in the art could readily conceive of within the technical scope disclosed herein shall be covered by the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.
Claims (22)
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2022500616A JP2022540101A (en) | 2020-01-18 | 2021-01-15 | POSITIONING METHOD AND APPARATUS, ELECTRONIC DEVICE, COMPUTER-READABLE STORAGE MEDIUM |
| KR1020227018711A KR20220093187A (en) | 2020-01-18 | 2021-01-15 | Positioning method and apparatus, electronic device, computer readable storage medium |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010058788.7A CN111275040B (en) | 2020-01-18 | 2020-01-18 | Positioning method and device, electronic device, computer-readable storage medium |
| CN202010058788.7 | 2020-01-18 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2021143865A1 (en) | 2021-07-22 |
Family ID: 70998770
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2021/072210 (WO2021143865A1, Ceased) | 2020-01-18 | 2021-01-15 | Positioning method and apparatus, electronic device, and computer readable storage medium |
Country Status (4)
| Country | Link |
|---|---|
| JP (1) | JP2022540101A (en) |
| KR (1) | KR20220093187A (en) |
| CN (1) | CN111275040B (en) |
| WO (1) | WO2021143865A1 (en) |
Families Citing this family (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN111275040B (en) | 2020-01-18 | 2023-07-25 | Beijing SenseTime Technology Development Co., Ltd. | Positioning method and device, electronic device, computer-readable storage medium |
| CN111931723B (en) | 2020-09-23 | 2021-01-05 | Beijing Yizhen Xuesi Education Technology Co., Ltd. | Target detection and image recognition method and device, and computer readable medium |
| CN114613147B (en) | 2020-11-25 | 2023-08-04 | Zhejiang Uniview Technologies Co., Ltd. | Method, device, medium, and electronic device for identifying vehicle violations |
| CN112819003B (en) | 2021-04-19 | 2021-08-27 | Beijing Miaoyijia Health Technology Group Co., Ltd. | Method and device for improving OCR recognition accuracy of physical examination reports |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN108764292A (en) * | 2018-04-27 | 2018-11-06 | Peking University | Deep learning image object mapping and localization method based on weakly supervised information |
| US20190035101A1 (en) * | 2017-07-27 | 2019-01-31 | Here Global B.V. | Method, apparatus, and system for real-time object detection using a cursor recurrent neural network |
| CN109426803A (en) * | 2017-09-04 | 2019-03-05 | Samsung Electronics Co., Ltd. | Method and apparatus for object recognition |
| CN109522938A (en) * | 2018-10-26 | 2019-03-26 | South China University of Technology | Deep learning-based method for recognizing targets in images |
| CN111275040A (en) * | 2020-01-18 | 2020-06-12 | Beijing SenseTime Technology Development Co., Ltd. | Positioning method and apparatus, electronic device, computer-readable storage medium |
Application timeline:
- 2020-01-18: CN application CN202010058788.7A, granted as CN111275040B (not active: Expired - Fee Related)
- 2021-01-15: KR application KR1020227018711A, published as KR20220093187A (not active: Withdrawn)
- 2021-01-15: JP application JP2022500616A, published as JP2022540101A (not active: Withdrawn)
- 2021-01-15: WO application PCT/CN2021/072210, published as WO2021143865A1 (not active: Ceased)
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113762109A (en) * | 2021-08-23 | 2021-12-07 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Training method of character positioning model and character positioning method |
| CN113762109B (en) * | 2021-08-23 | 2023-11-07 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Training method of character positioning model and character positioning method |
Also Published As
| Publication number | Publication date |
|---|---|
| JP2022540101A (en) | 2022-09-14 |
| KR20220093187A (en) | 2022-07-05 |
| CN111275040B (en) | 2023-07-25 |
| CN111275040A (en) | 2020-06-12 |
Similar Documents
| Publication | Title |
|---|---|
| WO2021143865A1 (en) | Positioning method and apparatus, electronic device, and computer readable storage medium |
| WO2020252917A1 (en) | Fuzzy face image recognition method and apparatus, terminal device, and medium |
| WO2022033150A1 (en) | Image recognition method, apparatus, electronic device, and storage medium |
| WO2021036059A1 (en) | Image conversion model training method, heterogeneous face recognition method, device and apparatus |
| WO2019033574A1 (en) | Electronic device, dynamic video face recognition method and system, and storage medium |
| WO2020037932A1 (en) | Image quality assessment method, apparatus, electronic device and computer readable storage medium |
| CN111339979B (en) | Image recognition method and image recognition device based on feature extraction |
| WO2019109526A1 (en) | Method and device for age recognition of face image, storage medium |
| WO2019041519A1 (en) | Target tracking device and method, and computer-readable storage medium |
| WO2021004186A1 (en) | Face collection method, apparatus, system, device, and medium |
| CN109033955B (en) | Face tracking method and system |
| CN113221771A (en) | Living body face recognition method, device, equipment, storage medium and program product |
| CN115205939B (en) | Training method and device for human face living body detection model, electronic equipment and storage medium |
| CN112200056B (en) | Face living body detection method and device, electronic equipment and storage medium |
| CN112561879B (en) | Blurry evaluation model training method, image blurriness evaluation method and device |
| CN114627534B (en) | Living body discriminating method, electronic apparatus, and storage medium |
| CN110381392A (en) | Video abstract extraction method, system, device, and storage medium |
| CN116091781B (en) | Data processing method and device for image recognition |
| CN118522057B (en) | Facial expression recognition method, device, electronic device and storage medium |
| CN114067394A (en) | Face living body detection method and device, electronic equipment and storage medium |
| WO2020172870A1 (en) | Method and apparatus for determining motion trajectory of target object |
| CN111524161B (en) | Method and device for extracting track |
| CN114220045A (en) | Object recognition method, device and computer-readable storage medium |
| CN116758457A (en) | Target detection method, device, equipment and medium |
| CN114692759A (en) | Video face classification method, device, equipment and storage medium |
Legal Events
| Code | Title | Description |
|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21740651; Country of ref document: EP; Kind code of ref document: A1 |
| ENP | Entry into the national phase | Ref document number: 2022500616; Country of ref document: JP; Kind code of ref document: A |
| ENP | Entry into the national phase | Ref document number: 20227018711; Country of ref document: KR; Kind code of ref document: A |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 21740651; Country of ref document: EP; Kind code of ref document: A1 |