
CN107301376B - A Pedestrian Detection Method Based on Deep Learning Multi-layer Stimulation - Google Patents

Info

Publication number
CN107301376B
CN107301376B (application CN201710385952.3A)
Authority
CN
China
Prior art keywords
pedestrian
frame
candidate
target
stimulation
Prior art date
Legal status
Active
Application number
CN201710385952.3A
Other languages
Chinese (zh)
Other versions
CN107301376A (en)
Inventor
李玺 (Li Xi)
李健 (Li Jian)
Current Assignee
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN201710385952.3A priority Critical patent/CN107301376B/en
Publication of CN107301376A publication Critical patent/CN107301376A/en
Application granted granted Critical
Publication of CN107301376B publication Critical patent/CN107301376B/en

Classifications

    • G — PHYSICS
    • G06 — COMPUTING OR CALCULATING; COUNTING
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 — Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 — Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G — PHYSICS
    • G06 — COMPUTING OR CALCULATING; COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 — Pattern recognition
    • G06F18/20 — Analysing
    • G06F18/24 — Classification techniques
    • G — PHYSICS
    • G06 — COMPUTING OR CALCULATING; COUNTING
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 — Arrangements for image or video recognition or understanding
    • G06V10/20 — Image preprocessing
    • G06V10/25 — Determination of region of interest [ROI] or a volume of interest [VOI]
    • G — PHYSICS
    • G06 — COMPUTING OR CALCULATING; COUNTING
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 — Scenes; Scene-specific elements
    • G06V20/40 — Scenes; Scene-specific elements in video content
    • G — PHYSICS
    • G06 — COMPUTING OR CALCULATING; COUNTING
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 — Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 — Movements or behaviour, e.g. gesture recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention discloses a pedestrian detection method based on deep-learning multi-layer stimulation which, given a video surveillance stream and a target to be detected, marks the positions at which the target appears in the video. The method comprises the following steps: acquiring a pedestrian data set for training a target detection model and defining the algorithm target; modeling the position deviation and apparent semantics of the pedestrian target; establishing a pedestrian multi-layer stimulation network model from the modeling result; and detecting pedestrian positions in monitoring images with the trained detection model. The method is suited to pedestrian detection in real video-surveillance images and shows better accuracy and robustness under a variety of complex conditions.

Description

Pedestrian detection method based on deep learning multi-layer stimulation
Technical Field
The invention belongs to the field of computer vision, and particularly relates to a pedestrian detection method based on deep learning multi-layer stimulation.
Background
With the development of computer vision since the end of the 20th century, intelligent video processing technology has received widespread attention and research. Pedestrian detection is an important and challenging task whose goal is to accurately locate pedestrians in video surveillance images. The problem has high application value in fields such as video surveillance and intelligent robotics, and underlies a large number of high-level vision tasks. It is also challenging in two respects: first, how to represent target-region information; second, how to model and optimize candidate-region extraction and target classification jointly. These challenges place high demands on the performance and robustness of the corresponding algorithm.
A typical pedestrian detection algorithm has three parts: (1) find candidate regions in the input image that may contain the target; (2) extract hand-crafted target features from the candidate regions; (3) run a classification algorithm on those features to complete the detection task. This approach has the following problems: 1) it relies on traditional visual features, which express only low-level visual information, whereas the pedestrian detection task requires a model with high-level abstract semantic understanding; 2) candidate-region extraction and feature classification are not optimized by end-to-end learning; 3) the features extracted by deep learning are not combined through multi-layer stimulation, so the target features are not abstract enough.
Disclosure of Invention
To solve the above problems, the object of the present invention is to provide a pedestrian detection method based on deep-learning multi-layer stimulation for detecting pedestrian positions in a given monitoring image. The method is built on a deep neural network, represents target-region information with multi-layer-stimulation deep visual features, models pedestrian detection within the Faster R-CNN framework, and adapts well to the complex situations found in real video-surveillance scenes.
In order to achieve the purpose, the technical scheme of the invention is as follows:
a pedestrian detection method based on deep learning multi-layer stimulation comprises the following steps:
s1, acquiring a pedestrian data set for training a target detection model, and defining an algorithm target;
s2, modeling the position deviation and the apparent semantic meaning of the pedestrian target;
s3, establishing a pedestrian multilayer stimulation network model according to the modeling result in the step S2;
and S4, detecting the pedestrian position in the monitored image by using the detection model.
Further, in step S1, the pedestrian data set for training the target detection model comprises pedestrian images X_train and manually annotated pedestrian positions B;
the algorithm target is defined as: detect the pedestrian positions P in a monitoring image X.
Further, in step S2, modeling the position deviation and apparent semantics of the pedestrian target specifically comprises:
S21. Model the position deviation from the pedestrian data set X_train and the pedestrian positions P:
t_x = (x − x_a) / w_a,   t_y = (y − y_a) / h_a    Formula (1)
t_w = log(w / w_a),   t_h = log(h / h_a)    Formula (2)
where (x, y) is the center coordinate of the annotated pedestrian box and w, h are its width and height; (x_a, y_a) is the center coordinate of the pedestrian candidate box and w_a, h_a are its width and height. t_x is the x-coordinate deviation of the annotated box from the candidate box as a proportion of the candidate-box width; t_y is the y-coordinate deviation as a proportion of the candidate-box height; t_w and t_h are the log ratios of the annotated box's width and height to those of the candidate box.
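The position-deviation encoding above follows the Faster R-CNN box parameterization that the method adopts. A minimal sketch, assuming a (center_x, center_y, width, height) box convention; the function names are illustrative, not from the patent:

```python
import math

def encode_box_deviation(gt, anchor):
    """Encode a ground-truth pedestrian box relative to a candidate
    (anchor) box: normalized center offsets plus log size ratios.
    Boxes are (center_x, center_y, width, height)."""
    x, y, w, h = gt
    xa, ya, wa, ha = anchor
    tx = (x - xa) / wa      # x offset, normalized by candidate-box width
    ty = (y - ya) / ha      # y offset, normalized by candidate-box height
    tw = math.log(w / wa)   # log ratio of widths
    th = math.log(h / ha)   # log ratio of heights
    return tx, ty, tw, th

def decode_box_deviation(t, anchor):
    """Invert the encoding: recover a box from predicted deviations,
    as done when correcting with the predicted deviation O in step S4."""
    tx, ty, tw, th = t
    xa, ya, wa, ha = anchor
    return (xa + tx * wa, ya + ty * ha,
            wa * math.exp(tw), ha * math.exp(th))
```

Encoding a box against a candidate and decoding the result recovers the original box, which is why the network can regress deviations rather than absolute coordinates.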
s22, according to the pedestrian data set XtrainAnd pedestrian position P modeling appearance semantics:
s=<w,d>
Figure BDA0001306371040000023
where s represents the projection value of the feature d onto a projection vector w, w is the pedestrian weight projection vector, d is the pedestrian feature descriptor,<.,.>is the inner product operator, p (C ═ k | d) is the softmax function, indicating the probability values belonging to class k; sjIs the projection value of the feature d on the jth projection vector w; c is a discrete random variable with the value number of k; j is the index of the jth w of the total projection vectors w.
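The projection-plus-softmax scoring of S22 can be sketched in pure Python; storing the k projection vectors as a list W is an assumption about layout, not something the patent specifies:

```python
import math

def softmax_scores(d, W):
    """Project the feature descriptor d onto each class projection
    vector w_j (s_j = <w_j, d>), then apply softmax to obtain the
    class probabilities p(C = k | d)."""
    s = [sum(wj * dj for wj, dj in zip(w, d)) for w in W]
    m = max(s)                             # subtract max for numerical stability
    exps = [math.exp(sj - m) for sj in s]
    z = sum(exps)
    return [e / z for e in exps]
```

For example, a feature aligned with the first projection vector yields a probability greater than 1/k for that class, and the probabilities always sum to one.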
Further, in step S3, establishing the pedestrian multi-layer stimulation network model according to the modeling result of step S2 specifically comprises:
S31. Establish a multi-layer stimulation convolutional neural network whose input is a monitoring image X and the pedestrian annotation boxes B, and whose output is the probability value p of each pedestrian candidate box together with the pedestrian position deviations O in X; the network structure is represented as the mapping X → (p, O);
S32. The sub-mapping X → p uses the soft-maximum (softmax) loss function, expressed as
L_cls(X, Y; θ) = −Σ_j Y_j log p(C = j | d)    Formula (3)
where Y is a binary (one-hot) vector whose entry for the true class k is 1 and whose other entries are 0; L_cls(X, Y; θ) denotes the softmax loss over the whole training data set;
s33, child mapping X → O Using Euclidean loss function, expressed as
Lloc(t,v)=∑ismooth(ti,vi)
Figure BDA0001306371040000032
Wherein t isiIs a pedestrian position deviation tag, viIs a pedestrian position deviation predicted value; i represents the ith training sample;
s34 loss function of the whole multi-layer stimulation neural network
L=Lcls+LlocFormula (5)
The entire neural network is trained using a stochastic gradient descent and back propagation algorithm under a loss function L.
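The combined objective of Formulas (3)–(5) can be illustrated for a single candidate box. A toy sketch, assuming the one-hot label is given as a class index and smooth-L1 is used for the localization term; it computes the loss value only, not the gradient step:

```python
import math

def smooth_l1(x):
    """Smooth-L1 penalty on one deviation residual (Fast/Faster R-CNN style)."""
    return 0.5 * x * x if abs(x) < 1.0 else abs(x) - 0.5

def detection_loss(probs, label, t_pred, t_true):
    """L = L_cls + L_loc for one candidate box:
    cross-entropy on the class probabilities (Formula 3) plus
    smooth-L1 over the four deviations tx, ty, tw, th (Formula 4)."""
    l_cls = -math.log(probs[label])
    l_loc = sum(smooth_l1(p - t) for p, t in zip(t_pred, t_true))
    return l_cls + l_loc
```

With a perfect localization prediction the loss reduces to the classification term alone, which is the behavior stochastic gradient descent exploits when balancing the two sub-mappings.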
Further, in step S4, detecting pedestrian positions in the monitoring image comprises: input the monitoring image X to be detected into the trained neural network, judge from the output probability of each candidate box whether it contains a pedestrian, and finally correct with the predicted position deviation O to obtain the pedestrian positions P.
Compared with the existing pedestrian detection method, the pedestrian detection method applied to the video monitoring scene has the following beneficial effects:
Firstly, the pedestrian detection method of the invention builds its model on a deep convolutional neural network, unifying candidate-region generation and feature classification in a single network framework for joint learning and optimization, which improves the final effect of the method.
Secondly, the multi-layer stimulation algorithm provided by the invention can enrich the feature abstract capability, and meanwhile, the features learned by the algorithm enable the classifier to learn more robust classification rules.
The pedestrian detection method applied to the video monitoring scene has good application value in an intelligent video analysis system, and can effectively improve the efficiency and accuracy of pedestrian detection. For example, in traffic video monitoring, the pedestrian detection method can quickly and accurately detect the positions of all pedestrians, provide data for subsequent pedestrian search tasks, and greatly release human resources.
Drawings
FIG. 1 is a schematic flow chart of a pedestrian detection method applied to a video surveillance scene according to the present invention;
FIG. 2 is a schematic diagram of the loss function of the whole multi-layer neural network according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
On the contrary, the invention is intended to cover alternatives, modifications and equivalents that may be included within the spirit and scope of the invention as defined by the appended claims. Furthermore, in the following detailed description, certain specific details are set forth to provide a thorough understanding of the invention. It will be apparent to one skilled in the art that the invention may be practiced without these specific details.
Referring to fig. 1, in a preferred embodiment of the present invention, a pedestrian detection method based on deep learning multi-layer stimulation comprises the following steps:
First, a pedestrian data set for training the target detection model is acquired, comprising pedestrian images X_train and manually annotated pedestrian positions B;
the algorithm target is defined as: detect the pedestrian positions P in a monitoring image X.
Secondly, the position deviation and apparent semantics of the pedestrian target are modeled as follows.
First, the position deviation is modeled from the pedestrian data set X_train and the pedestrian positions P:
t_x = (x − x_a) / w_a,   t_y = (y − y_a) / h_a    Formula (1)
t_w = log(w / w_a),   t_h = log(h / h_a)    Formula (2)
where (x, y) is the center coordinate of the annotated pedestrian box and w, h are its width and height; (x_a, y_a) is the center coordinate of the pedestrian candidate box and w_a, h_a are its width and height; t_x and t_y are the normalized center offsets, and t_w and t_h the log size ratios, of the annotated box relative to the candidate box.
Second, the apparent semantics are modeled from the pedestrian data set X_train and the pedestrian positions P:
s_j = ⟨w_j, d⟩
p(C = k | d) = exp(s_k) / Σ_j exp(s_j)
where s_j is the projection of the feature d onto the j-th projection vector w_j; w is the pedestrian weight projection vector; d is the pedestrian feature descriptor; ⟨·,·⟩ is the inner-product operator; p(C = k | d) is the softmax function, giving the probability of belonging to class k; and C is a discrete random variable taking k values.
Then, a detection model of the pedestrian target is pre-trained according to the above modeling result. The steps are as follows:
Firstly, a multi-layer stimulation convolutional neural network is established; its input is a monitoring image X and the pedestrian annotation boxes B, and its output is the probability value p of each pedestrian candidate box together with the pedestrian position deviations O in X; the network structure can thus be represented as the mapping X → (p, O).
second, the sub-map X → p uses a soft maximum (Softmax) loss function, denoted as
Figure BDA0001306371040000052
Lcls(X,Y;θ)=-∑jYjlogp (C | d) formula (3)
Wherein Y is a binary vector, if the k-th class belongs to, the corresponding value is 1, and the rest is 0; l iscls(X, Y; θ) represents the softmax loss function of the entire training data set;
third, the sub-map X → O uses the Euclidean loss function, expressed as
Lloc(t,v)=∑ismooth(ti,vi)
Figure BDA0001306371040000053
Wherein t isiIs a pedestrian position deviation tag, viIs the predicted value of the pedestrian position deviation, and i represents the ith training sample.
Fourth, referring to FIG. 2, the loss function of the entire multi-layer neural network is
L = L_cls + L_loc    Formula (5)
The entire neural network is trained using a stochastic gradient descent and back propagation algorithm under a loss function L.
Finally, pedestrians in the monitoring image are detected with the trained detection model. Specifically, the preprocessed image is fed to the multi-layer stimulation detection framework. The framework extracts candidate boxes with 3 RPN networks; each RPN uses different feature information, so the candidate boxes it produces differ in size and scale. The candidate boxes extracted by each RPN are first filtered by their confidences to obtain 300 candidate regions each; the candidate regions of the 3 RPNs are then merged into 900 candidate regions, which are ranked by classification confidence from large to small and filtered to the final 300 target candidate regions. Candidate boxes whose classification probability falls below a given threshold are discarded, a non-maximum suppression algorithm removes overlapping duplicate detections, and the pedestrian positions P are finally obtained by correcting with the predicted position deviations O.
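The merge-then-filter stage described above (pool the RPN candidates, rank by confidence, threshold, and apply non-maximum suppression) can be sketched as follows. The corner-coordinate box format and the parameter defaults are illustrative assumptions; the patent fixes only the 300/900 region counts:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter > 0 else 0.0

def merge_and_filter(rpn_outputs, top_k=300, score_thresh=0.5, iou_thresh=0.5):
    """Pool the candidates of all RPN branches, keep the top_k by
    classification confidence, drop low-scoring boxes, then greedily
    suppress overlapping detections (NMS). Each candidate is a
    (box, score) pair; thresholds are illustrative defaults."""
    pooled = [c for branch in rpn_outputs for c in branch]
    pooled.sort(key=lambda c: c[1], reverse=True)
    pooled = [c for c in pooled[:top_k] if c[1] >= score_thresh]
    kept = []
    for box, score in pooled:          # greedy non-maximum suppression
        if all(iou(box, kb) < iou_thresh for kb, _ in kept):
            kept.append((box, score))
    return kept
```

Feeding the survivors through the deviation decoding of step S4 then yields the final pedestrian positions.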
In the above embodiment, the pedestrian detection method of the invention first models the position deviation and apparent semantics of the pedestrian target. On this basis, the original problem is converted into a multi-task learning problem and a pedestrian detection model is established on a deep neural network. Finally, pedestrian positions in the monitoring image are detected with the trained detection model.
Through the technical scheme, the embodiment of the invention develops a pedestrian detection algorithm based on deep learning multi-layer stimulation based on the deep learning technology. The invention can effectively model the position deviation and the apparent semantic information of the target at the same time, thereby detecting the accurate pedestrian position.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (1)

1. A pedestrian detection method based on deep-learning multi-layer stimulation, characterized by comprising the following steps:
S1. Acquire a pedestrian data set for training a target detection model and define the algorithm target; the data set comprises pedestrian images X_train and manually annotated pedestrian positions B; the algorithm target is defined as: detect the pedestrian positions P in a monitoring image X.
S2. Model the position deviation and apparent semantics of the pedestrian target, specifically comprising:
S21. Model the position deviation from the pedestrian data set X_train and the pedestrian positions P:
t_x = (x − x_a) / w_a,   t_y = (y − y_a) / h_a    Formula (1)
t_w = log(w / w_a),   t_h = log(h / h_a)    Formula (2)
where (x, y) is the center coordinate of the annotated pedestrian box and w, h are its width and height; (x_a, y_a) is the center coordinate of the pedestrian candidate box and w_a, h_a are its width and height; t_x and t_y are the center offsets of the annotated box relative to the candidate box, normalized by the candidate box's width and height respectively, and t_w and t_h are the log ratios of the annotated box's width and height to those of the candidate box;
S22. Model the apparent semantics from the pedestrian data set X_train and the pedestrian positions P:
s_j = ⟨w_j, d⟩
p(C = k | d) = exp(s_k) / Σ_j exp(s_j)
where s_j is the projection of the feature d onto the j-th projection vector w_j; w is the pedestrian weight projection vector; d is the pedestrian feature descriptor; ⟨·,·⟩ is the inner-product operator; p(C = k | d) is the softmax function, giving the probability of belonging to class k; and C is a discrete random variable taking k values;
S3. Establish the pedestrian multi-layer stimulation network model according to the modeling result of step S2, specifically comprising:
S31. Establish a multi-layer stimulation convolutional neural network whose input is a monitoring image X and the pedestrian annotation boxes B, and whose output is the probability value p of each pedestrian candidate box together with the pedestrian position deviations O in X; the network structure is represented as the mapping X → (p, O);
S32. The sub-mapping X → p uses the soft-maximum (softmax) loss function, expressed as
L_cls(X, Y; θ) = −Σ_j Y_j log p(C = j | d)    Formula (3)
where Y is a binary (one-hot) vector whose entry for the true class k is 1 and whose other entries are 0; L_cls(X, Y; θ) denotes the softmax loss over the whole training data set;
S33. The sub-mapping X → O uses a smooth-L1 regression loss, expressed as
L_loc(t, v) = Σ_i smooth_L1(t_i − v_i)    Formula (4)
smooth_L1(x) = 0.5 x² if |x| < 1, and |x| − 0.5 otherwise
where t_i is the position-deviation label and v_i the predicted position deviation of the i-th training sample;
S34. The loss function of the whole multi-layer stimulation neural network is
L = L_cls + L_loc    Formula (5)
and the whole neural network is trained under the loss L using stochastic gradient descent and back-propagation;
the multi-layer stimulation neural network extracts candidate boxes with 3 RPN networks; each RPN uses different feature information, so the resulting candidate boxes differ in size and scale, and each RPN introduces a loss function L; during detection, the candidate boxes extracted by each RPN are first filtered by their confidences to obtain 300 candidate regions each; the candidate regions of the 3 RPNs are then merged into 900 candidate regions, which are ranked by classification confidence from large to small and filtered to the final 300 target candidate regions; candidate boxes are kept only if their output classification probability exceeds a given threshold, a non-maximum suppression algorithm removes overlapping duplicate detection boxes, and finally the pedestrian positions P are obtained by correcting with the predicted position deviations O;
S4. Detect the pedestrian positions in a monitoring image with the detection model: input the monitoring image X to be detected into the trained neural network, judge from the output probability of each candidate box whether it contains a pedestrian, and finally correct with the predicted position deviations O to obtain the pedestrian positions P.
CN201710385952.3A 2017-05-26 2017-05-26 A Pedestrian Detection Method Based on Deep Learning Multi-layer Stimulation Active CN107301376B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710385952.3A CN107301376B (en) 2017-05-26 2017-05-26 A Pedestrian Detection Method Based on Deep Learning Multi-layer Stimulation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710385952.3A CN107301376B (en) 2017-05-26 2017-05-26 A Pedestrian Detection Method Based on Deep Learning Multi-layer Stimulation

Publications (2)

Publication Number — Publication Date
CN107301376A (en) — 2017-10-27
CN107301376B (en) — 2021-04-13

Family

ID=60138099

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710385952.3A Active CN107301376B (en) 2017-05-26 2017-05-26 A Pedestrian Detection Method Based on Deep Learning Multi-layer Stimulation

Country Status (1)

Country Link
CN (1) CN107301376B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110163224B (en) * 2018-01-23 2023-06-20 天津大学 An Online Learning Auxiliary Data Labeling Method
CN108537117B (en) * 2018-03-06 2022-03-11 哈尔滨思派科技有限公司 Passenger detection method and system based on deep learning
CN108446662A (en) * 2018-04-02 2018-08-24 电子科技大学 A Pedestrian Detection Method Based on Semantic Segmentation Information
CN110969657B (en) * 2018-09-29 2023-11-03 杭州海康威视数字技术股份有限公司 Gun ball coordinate association method and device, electronic equipment and storage medium
CN111178267A (en) * 2019-12-30 2020-05-19 成都数之联科技有限公司 Video behavior identification method for monitoring illegal fishing
CN111476089B (en) * 2020-03-04 2023-06-23 上海交通大学 Pedestrian detection method, system and terminal for multi-mode information fusion in image
CN111523478B (en) * 2020-04-24 2023-04-28 中山大学 A Pedestrian Image Detection Method for Target Detection System

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106022237A (en) * 2016-05-13 2016-10-12 电子科技大学 Pedestrian detection method based on end-to-end convolutional neural network
CN106250812A * 2016-07-15 2016-12-21 汤平 A vehicle model recognition method based on a Fast R-CNN deep neural network

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10685262B2 (en) * 2015-03-20 2020-06-16 Intel Corporation Object recognition based on boosting binary convolutional neural network features
US20170098162A1 (en) * 2015-10-06 2017-04-06 Evolv Technologies, Inc. Framework for Augmented Machine Decision Making

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106022237A (en) * 2016-05-13 2016-10-12 电子科技大学 Pedestrian detection method based on end-to-end convolutional neural network
CN106250812A * 2016-07-15 2016-12-21 汤平 A vehicle model recognition method based on a Fast R-CNN deep neural network

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection; Zhaowei Cai et al.; European Conference on Computer Vision; 2016-09-17; pp. 354-358 *
Deep Convolutional Neural Networks for Pedestrian Detection with Skip Pooling; Jie Liu et al.; 2017 International Joint Conference on Neural Networks; 2017-05-19; pp. 1-9 *
Fast R-CNN; Ross Girshick; arXiv:1504.08083v2; 2015-09-27; pp. 2056-2063 *
R-FCN: Object Detection via Region-based Fully Convolutional Networks; Jifeng Dai et al.; arXiv:1605.06409v2; 2016-06-21; p. 4 *
Efficient Object Detection Based on Feature Sharing (基于特征共享的高效物体检测); Ren Shaoqing; China Doctoral Dissertations Full-text Database, Information Science and Technology; 2016-08-15; Vol. 2016, No. 8; Chapter 4 *

Also Published As

Publication number Publication date
CN107301376A (en) 2017-10-27

Similar Documents

Publication Publication Date Title
CN107301376B (en) A Pedestrian Detection Method Based on Deep Learning Multi-layer Stimulation
CN109492581B (en) A Human Action Recognition Method Based on TP-STG Framework
Zhou et al. Safety helmet detection based on YOLOv5
CN110147743B (en) A real-time online pedestrian analysis and counting system and method in complex scenes
CN105550678B (en) Human action feature extracting method based on global prominent edge region
Wang et al. Actionness estimation using hybrid fully convolutional networks
CN113408584B (en) RGB-D multi-modal feature fusion 3D target detection method
CN103971386B (en) A kind of foreground detection method under dynamic background scene
CN104601964B (en) Pedestrian target tracking and system in non-overlapping across the video camera room of the ken
CN110796057A (en) Pedestrian re-identification method and device and computer equipment
CN108416266A (en) A kind of video behavior method for quickly identifying extracting moving target using light stream
Liu et al. D-CenterNet: An anchor-free detector with knowledge distillation for industrial defect detection
CN111860171A (en) A method and system for detecting irregularly shaped targets in large-scale remote sensing images
CN108647625A (en) A kind of expression recognition method and device
CN107909027A (en) It is a kind of that there is the quick human body target detection method for blocking processing
CN106570480A (en) Posture-recognition-based method for human movement classification
CN115527072A (en) Chip surface defect detection method based on sparse space perception and meta-learning
Pang et al. Dance video motion recognition based on computer vision and image processing
CN105404894A (en) Target tracking method used for unmanned aerial vehicle and device thereof
CN110458022A (en) A self-learning target detection method based on domain adaptation
CN114170686A (en) Elbow bending behavior detection method based on human body key points
CN108256462A (en) A kind of demographic method in market monitor video
CN106815563B (en) A Crowd Prediction Method Based on Human Apparent Structure
CN108229524A (en) A kind of chimney and condensing tower detection method based on remote sensing images
CN107992854A (en) Forest Ecology man-machine interaction method based on machine vision

Legal Events

Code — Description
PB01 — Publication
SE01 — Entry into force of request for substantive examination
GR01 — Patent grant