
CN116626701B - A lidar target detection method based on spatiotemporal attention mechanism - Google Patents

A lidar target detection method based on spatiotemporal attention mechanism

Info

Publication number
CN116626701B
CN116626701B (application CN202310355420.0A)
Authority
CN
China
Prior art keywords
sampling
target
laser radar
point cloud
target detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310355420.0A
Other languages
Chinese (zh)
Other versions
CN116626701A (en)
Inventor
余杰
余昊
朱亚坤
刘义军
胡天人
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dongfeng Motor Group Co Ltd
Original Assignee
Dongfeng Motor Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dongfeng Motor Group Co Ltd filed Critical Dongfeng Motor Group Co Ltd
Priority to CN202310355420.0A priority Critical patent/CN116626701B/en
Publication of CN116626701A publication Critical patent/CN116626701A/en
Application granted granted Critical
Publication of CN116626701B publication Critical patent/CN116626701B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/88Lidar systems specially adapted for specific applications
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Remote Sensing (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Electromagnetism (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Optical Radar Systems And Details Thereof (AREA)

Abstract


This invention discloses a laser radar target detection method based on a spatiotemporal attention mechanism, comprising a model training phase and a target detection phase. During the model training phase, model weights are updated through multiple cycles to obtain final model weights, which include an attention weight matrix, thereby improving target detection accuracy. During the target detection phase, the attention weight matrix is used to extract features from key target areas, and the laser radar point cloud of the current frame is sampled based on the feature extraction results, thereby achieving target detection. Based on the attention mechanism, the present invention takes into account the spatial variations of the detection target, thereby improving sampling speed and accuracy.

Description

Laser radar target detection method based on space-time attention mechanism
Technical Field
The invention relates to the field of laser radar target detection, in particular to a laser radar target detection method based on a space-time attention mechanism.
Background
High-level intelligent driving technology is developing rapidly, and as laser radar sensors mature, target perception based on laser radar point clouds has become an important component of high-level intelligent driving. 3D target perception based on laser radar point clouds mainly takes two forms. The first filters the raw laser radar point cloud, performs ground segmentation, and clusters the point cloud to perceive 3D targets; this traditional approach cannot identify the specific category of a target, and its clustering results are easily degraded by point-cloud noise and by the quality of the ground segmentation. The second builds a neural network model for 3D target perception from laser radar point clouds using convolutional neural network techniques such as PointPillars and 3DSSD: tens of thousands of laser radar point-cloud frames are collected, targets are labeled, and target-related features are detected through model training. Target perception based on convolutional neural network models greatly improves perception accuracy and performs well in different scenes, making it the most widely used technique in intelligent-driving target perception today.
A first prior-art reference discloses a 4D millimeter-wave three-dimensional target detection method based on a self-attention mechanism: 4D millimeter-wave radar point cloud data are collected in real time and preprocessed, and the preprocessed data are input into a pre-trained three-dimensional target detection model that outputs a target detection result. The model comprises a bird's-eye-view voxelization module, a pillar self-attention feature extraction module, a CNN backbone network, and an RPN detection head. The voxelization module voxelizes the 4D millimeter-wave radar point cloud from a bird's-eye-view perspective to extract feature information F for the whole space; the pillar self-attention module extracts global point-cloud features based on the self-attention mechanism to generate a BEV pseudo-image; the CNN backbone extracts features from the BEV pseudo-image and outputs a feature map; and the RPN detection head performs target detection on the feature map and outputs a 3D detection result.
The first prior art has the following disadvantages: 1) the bird's-eye-view voxelization module loses part of the millimeter-wave radar point-cloud information during the down-sampling that generates the pillars, so the subsequent target detection result cannot reach a high level; 2) attention features are extracted from the partitioned pillars without using the current frame's detection result as prior information for the next frame's point cloud, so the improvement in detection accuracy and speed is limited.
A second prior-art reference discloses a three-dimensional target detection method with multi-view feature fusion of 4D millimeter-wave and laser point clouds: millimeter-wave radar point cloud data and laser radar point cloud data are collected simultaneously and input into a pre-established, trained millimeter-wave/laser-radar fusion network that outputs a three-dimensional target detection result. The fusion network learns the interaction information of the laser radar and the millimeter-wave radar from the BEV view and from the perspective view respectively, and splices the interaction information, thereby fusing the millimeter-wave radar and laser radar point cloud data.
The second prior art has the following disadvantages: 1) the voxelization module outputs Pillar features of the millimeter-wave radar point cloud and of the laser radar point cloud and then converts them into BEV views for target detection, so detection accuracy and speed are limited by the size of the partitioned Pillar features; 2) detection of multi-scale targets under a single-scale feature map is poor.
Disclosure of Invention
The invention aims to provide a laser radar target detection method based on a space-time attention mechanism, so as to improve the target detection precision and the target detection rate of a laser radar.
In order to solve the above technical problems, the invention provides the following technical scheme: a laser radar target detection method based on a space-time attention mechanism, in which a target detection model is built, comprising a model training stage and a target detection stage;
in the model training stage, a target-labeled data set is used for training, and the following steps are performed on the data set frame by frame:
S101, if the current frame is the first frame of laser radar point cloud, sample it according to a certain sampling rule to obtain a sampling initial seed set P_seed; if the current frame is not the first frame, use the attention map F_ROI containing several key target areas obtained in S107 and sample the current frame's laser radar point cloud according to the convolution result of F_ROI, obtaining the sampling initial seed set P_seed;
S102, perform fusion sampling on the sampling initial seed set P_seed to obtain a feature point cloud P_FS;
S103, map the feature point cloud P_FS to a BEV view;
S104, construct a multi-scale feature-pyramid backbone network based on an FPN (feature pyramid network), input the BEV view M_bev into the backbone network, and extract feature maps of different scales through convolution operations;
S105, construct a multi-scale detection head with fully connected layers under the feature map of each scale;
S106, select the next frame of laser radar point cloud;
S107, calculate a rotation transformation matrix T via motion transformation from the motion information v of the laser radar carrier and the time interval t between the current frame and the previous adjacent frame; then, according to the target pose information p and dimension information s in the target detection result of S105, combined with the rotation transformation matrix T, extract the key target areas in the current frame's laser radar point cloud, forming an attention map F_ROI corresponding to the several key target areas;
S108, return to S101;
in the model training stage, S101–S108 are iterated cyclically, the loss is calculated between the target detection result of the target detection model and the ground-truth target labels in the training data set, and the model weights are updated over many iterations to obtain the final model weights, which include an attention weight matrix F_w;
in the target detection stage, the target detection model is configured with the final model weights obtained in the model training stage, and performs the following steps:
S201, if the current frame is the first frame of laser radar point cloud, sample it according to a certain sampling rule to obtain a sampling initial seed set P_seed; if the current frame is not the first frame, use the attention weight matrix F_w obtained in the model training stage to extract features from the attention map F_ROI containing several key target areas obtained in S207, and sample the current frame's laser radar point cloud according to the feature extraction result, obtaining the sampling initial seed set P_seed;
subsequently, S202–S207 are executed in sequence; their specific operations are the same as those of S102–S107;
finally, S208 is executed, returning to S201.
According to the above scheme, the data set used in the model training stage is labeled with targets including motor vehicles, non-motor vehicles, and pedestrians.
According to the above scheme, in S101, when the current frame is not the first frame of laser radar point cloud, the sampling initial seed set P_seed is obtained as follows: the attention map F_ROI of the key target areas is convolved to obtain the feature F_out,
F_out = Conv(F_ROI)
then the sampling initial seed set index S_index is obtained from F_out and the current frame point cloud set P,
S_index = F_out * P
and the sampling initial seed set P_seed is extracted from the current frame point cloud set P according to S_index,
P_seed = P[S_index].
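The operations F_out = Conv(F_ROI), S_index = F_out * P, and P_seed = P[S_index] are stated abstractly; a minimal PyTorch-style sketch of one plausible reading follows, assuming F_ROI is rasterized over the BEV grid, Conv is a small smoothing kernel, and the index is formed by scoring each point with the attention value of its cell and keeping the top k (grid extent, kernel, and k are illustrative assumptions, not fixed by the patent):

import torch
import torch.nn.functional as F

def select_seed_set(points, f_roi, k=4096, cell=0.10, x_min=0.0, y_min=-40.0):
    """Sample an initial seed set P_seed from the current frame.

    points : (N, 3) tensor of laser radar points (x, y, z)
    f_roi  : (H, W) attention map over the BEV grid, nonzero in key target areas
    """
    # F_out = Conv(F_ROI): smooth the ROI map with a small kernel (placeholder)
    kernel = torch.ones(1, 1, 3, 3) / 9.0
    f_out = F.conv2d(f_roi[None, None], kernel, padding=1)[0, 0]   # (H, W)

    # Score each point by the attention value of the BEV cell it falls in
    ix = ((points[:, 0] - x_min) / cell).long().clamp(0, f_out.shape[0] - 1)
    iy = ((points[:, 1] - y_min) / cell).long().clamp(0, f_out.shape[1] - 1)
    scores = f_out[ix, iy]                       # one score per point

    # S_index and P_seed = P[S_index]: keep the k highest-scoring points
    s_index = torch.topk(scores, min(k, points.shape[0])).indices
    return points[s_index]

Under this reading, points inside the propagated target areas are sampled preferentially, which is what lets the seed set concentrate on likely targets before any heavy computation.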
According to the above scheme, the fusion sampling in S102 specifically comprises sampling the initial seed set P_seed with the Euclidean-distance-based furthest point sampling and the feature-distance-based furthest point sampling from the 3DSSD backbone network, obtaining sampling results L_d(P) and L_f(P) respectively, and then obtaining the fusion sampling result P_FS of P_seed based on the fusion strategy C(P), where
C(P) = L_d(P) + L_f(P).
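A sketch of this fusion sampling, reading the '+' in C(P) as the union of the two sampled index sets (D-FPS over coordinates and F-FPS over per-point features, as in 3DSSD); the greedy loop is a standard reference implementation of furthest point sampling, not the patent's code:

import torch

def furthest_point_sampling(x, m):
    """Greedy FPS: pick m rows of x (N, D) that are maximally spread out."""
    n = x.shape[0]
    m = min(m, n)
    idx = torch.zeros(m, dtype=torch.long)
    dist = torch.full((n,), float("inf"))
    farthest = 0
    for i in range(m):
        idx[i] = farthest
        d = ((x - x[farthest]) ** 2).sum(dim=1)  # distance to the newest seed
        dist = torch.minimum(dist, d)            # distance to the seed set
        farthest = int(torch.argmax(dist))       # furthest point from the set
    return idx

def fusion_sample(p_seed, feats, m=1024):
    """C(P) = Ld(P) + Lf(P): union of geometric and feature-space FPS."""
    ld = furthest_point_sampling(p_seed[:, :3], m)  # Euclidean-distance FPS
    lf = furthest_point_sampling(feats, m)          # feature-distance FPS
    fused = torch.unique(torch.cat([ld, lf]))
    return p_seed[fused]                            # feature point cloud P_FS

D-FPS keeps geometric coverage of the scene, while F-FPS retains points that are distinctive in feature space (typically foreground), so the union preserves both.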
According to the above scheme, the model training stage calculates the loss using the loss function in PointPillars.
According to the above scheme, the model training stage updates the model weights with an Adam optimizer.
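A hedged sketch of the loss and weight update described in the two paragraphs above, assuming the PointPillars formulation (focal classification loss plus smooth-L1 box regression; the alpha = 0.25, gamma = 2, and beta weights are the PointPillars defaults, not values taken from this patent):

import torch
import torch.nn.functional as F

def pointpillars_style_loss(cls_logits, box_preds, cls_targets, box_targets,
                            pos_mask, beta_cls=1.0, beta_loc=2.0):
    """Focal classification loss + smooth-L1 box loss, PointPillars-style."""
    p = torch.sigmoid(cls_logits)
    pt = torch.where(cls_targets > 0.5, p, 1 - p)
    focal = -0.25 * (1 - pt) ** 2 * torch.log(pt.clamp(min=1e-6))
    cls_loss = focal.sum() / pos_mask.sum().clamp(min=1)
    if pos_mask.any():                       # box loss over positive anchors only
        loc_loss = F.smooth_l1_loss(box_preds[pos_mask], box_targets[pos_mask])
    else:
        loc_loss = box_preds.sum() * 0.0
    return beta_cls * cls_loss + beta_loc * loc_loss

# Adam weight update over the cyclic iterations S101-S108, e.g.:
#   optimizer = torch.optim.Adam(model.parameters(), lr=2e-4)
#   loss = pointpillars_style_loss(...); optimizer.zero_grad()
#   loss.backward(); optimizer.step()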
The beneficial effects of the invention are as follows:
1. Extracting the target areas and the sampling seed set based on a space-time attention mechanism improves sampling precision and speed. The resulting feature-rich point cloud allows the resolution of the BEV view to be raised without additional computational cost, and a high-resolution BEV view in turn improves target detection precision.
2. The multi-scale feature-pyramid backbone network and the multi-scale detection heads improve the detection rate of target detection and cover targets of different scales.
Drawings
FIG. 1 is a flow chart of a method for detecting a target of a lidar based on a spatio-temporal attention mechanism according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a backbone network architecture of a multi-scale feature pyramid in accordance with one embodiment of the present invention.
In Fig. 2, Conv1, Conv2, and Conv3 are each multi-layer convolutional neural networks, and M_L, M_M, and M_S are the feature maps for detecting large-scale, medium-scale, and small-scale targets, respectively.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present disclosure more apparent, the technical solutions of the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings of the embodiments of the present disclosure. It will be apparent that the described embodiments are some, but not all, of the embodiments of the present disclosure. All other embodiments, which can be made by one of ordinary skill in the art without the need for inventive faculty, are within the scope of the present disclosure, based on the described embodiments of the present disclosure.
A laser radar target detection method based on a space-time attention mechanism is provided; the method builds a target detection model and comprises a model training stage and a target detection stage;
the data set used for training in the model training stage consists of laser radar point clouds collected in real scenes and labeled with targets frame by frame, the labeled targets including motor vehicles, non-motor vehicles, and pedestrians; in the model training stage the target detection model proceeds as follows:
S101, if the current frame is the first frame of laser radar point cloud, compute the center of gravity of the first frame and select the points furthest from the center of gravity as the sampling initial seed set P_seed; if the current frame is not the first frame, convolve the attention map F_ROI containing several key target areas obtained in S107 to obtain the feature F_out, i.e. F_out = Conv(F_ROI); then obtain the sampling initial seed set index S_index from F_out and the current frame point cloud set P, specifically S_index = F_out * P; and then extract the sampling initial seed set P_seed from the current frame point cloud set P according to S_index, i.e. P_seed = P[S_index];
S102, sample the sampling initial seed set P_seed with the Euclidean-distance-based furthest point sampling and the feature-distance-based furthest point sampling from the 3DSSD backbone network, obtaining sampling results L_d(P) and L_f(P) respectively; then obtain the fusion sampling result of P_seed based on the fusion strategy C(P), i.e. the feature point cloud P_FS, whose size is (N_m, C_m), where N_m is the number of points after fusion sampling and C_m is the number of point-cloud feature channels after fusion sampling;
S103, map the feature point cloud P_FS to a BEV view at a certain mapping resolution according to the (X, Y) coordinates of P_FS (in this embodiment the mapping resolution is f = 0.10 m; since the original laser radar point cloud has already undergone fusion sampling, this mapping resolution improves the precision of the BEV view), generating a BEV view M_bev of size (H, W, C_m), where H and W are the length and width of the BEV view, and the pixel value is M_bev(i, j) = max(P_cell), where i and j are the pixel coordinates in the BEV view and P_cell is the set of points falling within that pixel's range (a sketch of this mapping appears after this list);
S104, construct a multi-scale feature-pyramid backbone network based on an FPN (feature pyramid network), input the BEV view M_bev into the backbone network, and extract feature maps of different scales through convolution operations (a sketch of the backbone appears after the training-loop paragraph below);
S105, construct a multi-scale detection head with fully connected layers under the feature map of each scale, perform target detection on the current frame's laser radar point cloud with the multi-scale detection head, and obtain the target detection result after NMS processing;
S106, select the next frame of laser radar point cloud;
S107, calculate a rotation transformation matrix T via motion transformation from the motion information v of the laser radar carrier and the time interval t between the current frame and the previous adjacent frame; then, according to the target pose information p and dimension information s in the target detection result of S105, combined with the rotation transformation matrix T, extract the key target areas in the current frame's laser radar point cloud, forming an attention map F_ROI corresponding to the several key target areas (a sketch of this ROI propagation appears after the target detection steps below);
S108, return to S101;
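A minimal sketch of the BEV mapping in S103, assuming M_bev(i, j) = max(P_cell) is a channel-wise maximum over the features of the points falling in each cell; the grid extent is illustrative:

import torch

def map_to_bev(p_fs, feats, f=0.10, x_range=(0.0, 70.4), y_range=(-40.0, 40.0)):
    """Scatter the feature point cloud P_FS into a BEV view M_bev of size (H, W, C_m).

    p_fs  : (N_m, 3) point coordinates after fusion sampling
    feats : (N_m, C_m) per-point features after fusion sampling
    """
    H = int((x_range[1] - x_range[0]) / f)
    W = int((y_range[1] - y_range[0]) / f)
    m_bev = torch.zeros(H, W, feats.shape[1])

    i = ((p_fs[:, 0] - x_range[0]) / f).long().clamp(0, H - 1)
    j = ((p_fs[:, 1] - y_range[0]) / f).long().clamp(0, W - 1)

    # M_bev(i, j) = max(P_cell): channel-wise max over the points in each cell
    # (empty cells stay at zero; a sketch simplification)
    flat = i * W + j
    m_bev.view(H * W, -1).scatter_reduce_(
        0, flat[:, None].expand_as(feats), feats, reduce="amax")
    return m_bev

Because fusion sampling has already thinned the cloud to informative points, the 0.10 m cell size keeps the view dense without blowing up memory.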
In the model training stage, S101–S108 are iterated cyclically; the loss between the target detection result of the target detection model and the ground-truth target labels in the training data set is calculated with the PointPillars loss function, the model weights are updated with an Adam optimizer, and after many iterations the final model weights are obtained, which include an attention weight matrix F_w;
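A sketch of the FPN backbone and multi-scale heads of S104–S105, following Fig. 2's Conv1/Conv2/Conv3 and the maps M_L, M_M, M_S for large-, medium-, and small-scale targets; the channel widths, the top-down pathway, and the 1x1-convolution heads (standing in for per-location fully connected layers) are assumptions, not details fixed by the patent:

import torch
import torch.nn as nn

class FPNBackbone(nn.Module):
    """Multi-scale feature pyramid over the BEV view M_bev (cf. Fig. 2)."""

    def __init__(self, c_in, c=64, num_anchors=2, num_classes=3):
        super().__init__()

        def block(ci, co):                     # stride-2 convolution stage
            return nn.Sequential(nn.Conv2d(ci, co, 3, stride=2, padding=1),
                                 nn.BatchNorm2d(co), nn.ReLU())

        self.conv1 = block(c_in, c)            # Conv1
        self.conv2 = block(c, 2 * c)           # Conv2
        self.conv3 = block(2 * c, 4 * c)       # Conv3
        self.lat1 = nn.Conv2d(c, 4 * c, 1)     # lateral connections
        self.lat2 = nn.Conv2d(2 * c, 4 * c, 1)
        self.up = nn.Upsample(scale_factor=2, mode="nearest")

        def head():                            # per-scale detection head
            return nn.Conv2d(4 * c, num_anchors * (num_classes + 7), 1)

        self.head_l, self.head_m, self.head_s = head(), head(), head()

    def forward(self, m_bev):                  # m_bev: (B, C_m, H, W)
        c1 = self.conv1(m_bev)
        c2 = self.conv2(c1)
        m_l = self.conv3(c2)                   # M_L: large-scale targets
        m_m = self.lat2(c2) + self.up(m_l)     # M_M: medium-scale targets
        m_s = self.lat1(c1) + self.up(m_m)     # M_S: small-scale targets
        return self.head_l(m_l), self.head_m(m_m), self.head_s(m_s)

Each head predicts class scores and 7 box parameters (x, y, z, l, w, h, yaw) per anchor; the three outputs are decoded and merged by NMS as in S105.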
in the target detection stage, the target detection model is configured with the final model weights obtained in the model training stage and performs the following steps:
S201, if the current frame is the first frame of laser radar point cloud, compute the center of gravity of the first frame and select the points furthest from the center of gravity as the sampling initial seed set P_seed; if the current frame is not the first frame, use the attention weight matrix F_w obtained in the model training stage to extract features from the attention map F_ROI containing several key target areas obtained in S207, obtaining F_out = F_w(F_ROI); then obtain the sampling initial seed set index S_index from F_out and the current frame point cloud set P, specifically S_index = F_out * P; and extract the sampling initial seed set P_seed from the current frame point cloud set P according to S_index, i.e. P_seed = P[S_index];
subsequently, S202–S207 are executed in sequence; their specific operations are the same as those of S102–S107;
finally, S208 is executed, returning to S201.
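A sketch of the temporal ROI propagation of S107/S207, assuming planar carrier motion, v = (v_x, v_y, omega) with yaw rate omega, so that the interval t yields a 2D rigid transform T, and that each detected box (pose p, size s) is shifted into the new frame and rasterized, with a margin, into the attention map F_ROI; the sign conventions and the axis-aligned rasterization are illustrative assumptions:

import math
import torch

def propagate_roi(detections, v, t, grid=(704, 800), f=0.10,
                  x_min=0.0, y_min=-40.0, margin=0.5):
    """Build the attention map F_ROI for the next frame from this frame's boxes.

    detections : list of (x, y, length, width) boxes in the carrier frame
    v          : (v_x, v_y, omega) carrier velocity and yaw rate
    t          : time interval between the adjacent frames
    """
    vx, vy, omega = v
    theta = omega * t
    # rotation transformation matrix T (homogeneous 2D rigid transform);
    # static objects shift opposite to the carrier's own motion
    T = torch.tensor([[math.cos(theta), -math.sin(theta), -vx * t],
                      [math.sin(theta),  math.cos(theta), -vy * t],
                      [0.0,              0.0,              1.0]])

    f_roi = torch.zeros(grid)
    for (x, y, length, width) in detections:
        cx, cy, _ = T @ torch.tensor([x, y, 1.0])  # box center in the new frame
        # rasterize the key target area, padded by a margin for target motion
        i0 = max(int((cx - length / 2 - margin - x_min) / f), 0)
        i1 = max(int((cx + length / 2 + margin - x_min) / f), 0)
        j0 = max(int((cy - width / 2 - margin - y_min) / f), 0)
        j1 = max(int((cy + width / 2 + margin - y_min) / f), 0)
        f_roi[i0:i1, j0:j1] = 1.0
    return f_roi

The resulting F_ROI is exactly the prior that S101/S201 consume: it tells the sampler where the previous frame's targets should be now, given the carrier's own motion.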
The foregoing description is only an embodiment of the present invention and is not intended to limit its scope; all equivalent structures or equivalent process transformations, and all direct or indirect applications in other related technical fields, are likewise included within the scope of the present invention.

Claims (6)

1. A laser radar target detection method based on a space-time attention mechanism, characterized in that a target detection model is built by the method, and the method comprises a model training stage and a target detection stage;
in the model training stage, a target-labeled data set is used for training, and the following steps are performed on the data set frame by frame:
S101, if the current frame is the first frame of laser radar point cloud, sample it according to a certain sampling rule to obtain a sampling initial seed set P_seed; if the current frame is not the first frame, use the attention map F_ROI containing several key target areas obtained in S107 and sample the current frame's laser radar point cloud according to the convolution result of F_ROI, obtaining the sampling initial seed set P_seed;
S102, perform fusion sampling on the sampling initial seed set P_seed to obtain a feature point cloud P_FS;
S103, map the feature point cloud P_FS to a BEV view;
S104, construct a multi-scale feature-pyramid backbone network based on an FPN (feature pyramid network), input the BEV view M_bev into the backbone network, and extract feature maps of different scales through convolution operations;
S105, construct a multi-scale detection head with fully connected layers under the feature map of each scale;
S106, select the next frame of laser radar point cloud;
S107, calculate a rotation transformation matrix T via motion transformation from the motion information v of the laser radar carrier and the time interval t between the current frame and the previous adjacent frame; then, according to the target pose information p and dimension information s in the target detection result of S105, combined with the rotation transformation matrix T, extract the key target areas in the current frame's laser radar point cloud, forming an attention map F_ROI corresponding to the several key target areas;
S108, return to S101;
in the model training stage, S101–S108 are iterated cyclically, the loss is calculated between the target detection result of the target detection model and the ground-truth target labels in the training data set, and the model weights are updated over many iterations to obtain the final model weights, which include an attention weight matrix F_w;
in the target detection stage, the target detection model is configured with the final model weights obtained in the model training stage, and performs the following steps:
S201, if the current frame is the first frame of laser radar point cloud, sample it according to a certain sampling rule to obtain a sampling initial seed set P_seed; if the current frame is not the first frame, use the attention weight matrix F_w obtained in the model training stage to extract features from the attention map F_ROI containing several key target areas obtained in S207, and sample the current frame's laser radar point cloud according to the feature extraction result, obtaining the sampling initial seed set P_seed;
subsequently, S202–S207 are executed in sequence; their specific operations are the same as those of S102–S107;
finally, S208 is executed, returning to S201.
2. The laser radar target detection method based on a space-time attention mechanism of claim 1, wherein the data set used in the model training stage is labeled with targets including motor vehicles, non-motor vehicles, and pedestrians.
3. The laser radar target detection method based on a space-time attention mechanism of claim 1, wherein in S101, when the current frame is not the first frame of laser radar point cloud, the sampling initial seed set P_seed is obtained as follows: the attention map F_ROI of the key target areas is convolved to obtain the feature F_out,
F_out = Conv(F_ROI)
then the sampling initial seed set index S_index is obtained from F_out and the current frame point cloud set P,
S_index = F_out * P
and the sampling initial seed set P_seed is extracted from the current frame point cloud set P according to S_index,
P_seed = P[S_index].
4. The laser radar target detection method based on a space-time attention mechanism of claim 1, wherein the fusion sampling in S102 specifically comprises sampling the initial seed set P_seed with the Euclidean-distance-based furthest point sampling and the feature-distance-based furthest point sampling from the 3DSSD backbone network, obtaining sampling results L_d(P) and L_f(P) respectively, and then obtaining the fusion sampling result P_FS of P_seed based on the fusion strategy C(P), where
C(P) = L_d(P) + L_f(P).
5. The laser radar target detection method based on a space-time attention mechanism of claim 1, wherein the model training stage calculates the loss using the loss function in PointPillars.
6. The laser radar target detection method based on a space-time attention mechanism of claim 1, wherein the model training stage updates the model weights with an Adam optimizer.
CN202310355420.0A 2023-04-04 2023-04-04 A lidar target detection method based on spatiotemporal attention mechanism Active CN116626701B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310355420.0A CN116626701B (en) 2023-04-04 2023-04-04 A lidar target detection method based on spatiotemporal attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310355420.0A CN116626701B (en) 2023-04-04 2023-04-04 A lidar target detection method based on spatiotemporal attention mechanism

Publications (2)

Publication Number Publication Date
CN116626701A CN116626701A (en) 2023-08-22
CN116626701B true CN116626701B (en) 2025-09-30

Family

ID=87608851

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310355420.0A Active CN116626701B (en) 2023-04-04 2023-04-04 A lidar target detection method based on spatiotemporal attention mechanism

Country Status (1)

Country Link
CN (1) CN116626701B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11222217B1 (en) * 2020-08-14 2022-01-11 Tsinghua University Detection method using fusion network based on attention mechanism, and terminal device
CN115496746A (en) * 2022-10-20 2022-12-20 复旦大学 Method and system for detecting surface defects of plate based on fusion of image and point cloud data

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114450720A (en) * 2020-08-18 2022-05-06 深圳市大疆创新科技有限公司 Target detection method and device and vehicle-mounted radar
CN114511572A (en) * 2020-10-28 2022-05-17 刘晋浩 Single tree segmentation flow method for forest tree measurement

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11222217B1 (en) * 2020-08-14 2022-01-11 Tsinghua University Detection method using fusion network based on attention mechanism, and terminal device
CN115496746A (en) * 2022-10-20 2022-12-20 复旦大学 Method and system for detecting surface defects of plate based on fusion of image and point cloud data

Also Published As

Publication number Publication date
CN116626701A (en) 2023-08-22

Similar Documents

Publication Publication Date Title
CN109034018B (en) Low-altitude small unmanned aerial vehicle obstacle sensing method based on binocular vision
CN109726627B (en) A neural network model training and detection method for universal ground wire
CN111860514B (en) Multi-category real-time segmentation method for orchard scene based on improvement DeepLab
CN111080659A (en) Environmental semantic perception method based on visual information
CN111060924B (en) A SLAM and Object Tracking Method
CN113936139A (en) A method and system for scene bird's-eye view reconstruction combining visual depth information and semantic segmentation
CN114037640B (en) Image generation method and device
Mahjourian et al. Geometry-based next frame prediction from monocular video
CN113361528B (en) Multi-scale target detection method and system
CN104517103A (en) Traffic sign classification method based on deep neural network
CN115143950B (en) A method for generating local semantic grid maps for intelligent vehicles
EP4174792A1 (en) Method for scene understanding and semantic analysis of objects
CN112861755B (en) Target multi-category real-time segmentation method and system
CN112859011A (en) Method for extracting waveform signals of single-wavelength airborne sounding radar
CN110599521A (en) Method for generating trajectory prediction model of vulnerable road user and prediction method
CN116597122A (en) Data labeling method, device, electronic equipment and storage medium
CN112819832B (en) Fine-grained boundary extraction method for semantic segmentation of urban scenes based on laser point cloud
CN115937704B (en) Remote sensing image road segmentation method based on topology perception neural network
CN117132884A (en) Crop remote sensing intelligent extraction method based on land parcel scale
Lu et al. A lightweight CNN–transformer network with Laplacian loss for low-altitude UAV imagery semantic segmentation
CN116626701B (en) A lidar target detection method based on spatiotemporal attention mechanism
CN119152200B (en) An improved autonomous driving target detection method based on YOLOv8
CN118587686A (en) A lunar obstacle recognition method based on transformer feature enhancement
Pu et al. Sdf-gan: Semi-supervised depth fusion with multi-scale adversarial networks
CN117557983A (en) Scene reconstruction method and driving assistance system based on depth forward projection and query back projection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant