CN110503073B - Dense multi-agent track prediction method for dynamic link at third view angle
- Publication number: CN110503073B (application CN201910807587.XA)
- Authority: CN (China)
- Prior art keywords: sampling, convolution, trajectory, gate, input
- Prior art date: 2019-08-29
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
- G06V20/53—Recognition of crowd images, e.g. recognition of crowd congestion
Abstract
The invention discloses a dynamically linked dense multi-agent trajectory prediction method from a third-person perspective. A variational autoencoder visual component compresses the data; the input trajectory frames X enter a dynamic recurrent unit that implements the encoding network; the encoded data are then decoded. The invention not only simulates the fluid-like spatio-temporal motion of multiple agents through the dynamic variation of the convolution kernel's sampling points, but also extracts spatial features of the agents' locations and learns from the data which pixels on the feature map to sample, reducing spatial feature redundancy. In a data-driven manner, the invention learns weights on the feature map with a fixed convolution kernel and applies a sigmoid function to the learned weights to obtain the sampling amplitude of the spatio-temporal data, which better matches objective sampling behavior and improves the model's generalization ability. The invention requires no agent trajectory points, enables multi-step prediction, improves model generalization, and reduces computational complexity.
Description
Technical Field
The invention relates to multi-agent trajectory prediction technology, and in particular to a dynamically linked dense multi-agent trajectory prediction method from a third-person perspective.
Background
In modern society, dense multi-agent activities such as large concerts, sporting events, religious activities, and mass gatherings are becoming increasingly frequent. Especially in a populous country such as China, predicting the movement trends of dense multi-agent crowds is one of the urgent problems in public safety research. Clearly, dense multi-agent trajectory prediction helps formulate corresponding safety management strategies, design better crowd diversion schemes, count crowd flow in real time, detect abnormal agent behavior, and protect citizens' personal safety.
At present, trajectory prediction for dense multi-agents still relies mainly on data-driven, fixed-connection trajectory-point prediction techniques. Taking the convolutional recurrent network structure as an example, because the convolution kernel size is fixed, the sampled neighbor positions can hardly change. Such techniques not only struggle to capture the fluid-like spatio-temporal motion trends of multiple agents (such as aggregation and diffusion), but are also prone to sampling-data redundancy. For long-horizon multi-step prediction, model generalization degrades sharply, which makes prediction costly and wastes human resources. To date, many problems in dense multi-agent trajectory prediction remain unsolved.
Summary of the Invention
To solve the above problems in the prior art, the invention proposes a dynamically linked dense multi-agent trajectory prediction method from a third-person perspective that can handle dynamic changes in spatio-temporal data and improve model generalization.
To achieve the above object, the technical solution of the invention is as follows. A dynamically linked dense multi-agent trajectory prediction method from a third-person perspective comprises the following steps:
A. Data compression with the variational autoencoder visual component
The variational autoencoder visual component feeds the input sequence of temporally dependent consecutive trajectory frames into an end-to-end neural network for learning, abstracting and compressing the trajectory frame data. The specific steps are as follows:
A1. The input consecutive trajectory frames X1, X2, ..., Xt-1, Xt have different scales and are resized to a common size of 128×128 with the function nn.imsize(X,128,128), where nn denotes the neural network function base-class name.
A2. The resized consecutive trajectory frames are encoded into a vector V by the neural network's fully connected operation, so that the former high-dimensional 128×128 representation becomes 400-dimensional, as shown below:

V = nn.Linear(X, 400)    (1)
A3. A two-dimensional convolution is applied to the vector V for down-sampling, and the neural network fits V into a low-dimensional mean vector μ and variance δ of a Gaussian distribution, with the specific formulas:

μ = nn.Conv2d(V)    (2)
δ = nn.Conv2d(V)    (3)
A4. Using the reparameterization trick, sampling a trajectory frame X from N(μ, δ²) is equivalent to sampling an ε from the standard normal distribution N(0,1) and letting X = μ + ε×δ. Sampling from the original N(μ, δ²) is thus transformed into sampling from the standard Gaussian N(0,1), after which the parameter transformation yields the sample from N(μ, δ²).
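A minimal PyTorch sketch of steps A1-A4 follows. The class name VisualVAE, the single-channel input, and the latent channel count are illustrative assumptions; the patent's nn.imsize is rendered with F.interpolate, and the variance is parameterized as a log-variance for numerical stability.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VisualVAE(nn.Module):
    """Variational-autoencoder visual component (steps A1-A4)."""
    def __init__(self, latent_ch=16):
        super().__init__()
        self.fc = nn.Linear(128 * 128, 400)  # A2: encode a frame into a 400-d vector V
        # A3: two Conv2d heads fit V to a Gaussian mean and (log-)variance while down-sampling
        self.conv_mu = nn.Conv2d(1, latent_ch, 3, stride=2, padding=1)
        self.conv_logvar = nn.Conv2d(1, latent_ch, 3, stride=2, padding=1)

    def forward(self, x):  # x: (B, 1, H, W) trajectory frames of varying size
        # A1: bring every frame to the common 128x128 size (the patent's nn.imsize(X,128,128))
        x = F.interpolate(x, size=(128, 128), mode='bilinear', align_corners=False)
        v = self.fc(x.flatten(1))            # A2: V = nn.Linear(X, 400)
        v = v.view(-1, 1, 20, 20)            # reshape the 400-d vector so Conv2d can act on it
        mu, logvar = self.conv_mu(v), self.conv_logvar(v)
        # A4: reparameterization -- draw eps ~ N(0,1) and return mu + eps * delta, so the
        # random draw stays outside the gradient path (delta taken as exp(0.5 * logvar))
        eps = torch.randn_like(mu)
        return mu + eps * torch.exp(0.5 * logvar)
```

Given a batch of frames of shape (B, 1, H, W), the component returns one low-dimensional latent sample per frame.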
B. The input trajectory frame X enters the dynamic recurrent unit to implement the encoding network
After the encoding network structure, the encoded trajectory frame X enters the dynamic recurrent unit, where the feature extraction of the encoded trajectory frame vectors is carried out. The specific steps are given by formula (4):

Zt = σ(Whz * Γ(Ht-1, φ(P)) + Wxz * Xt)
Rt = σ(Whr * Γ(Ht-1, φ(P)) + Wxr * Xt)
H̃t = tanh(Whh * (Rt ⊙ Γ(Ht-1, φ(P))) + Wxh * Xt)
Ht = Zt ⊙ Ht-1 + (1 - Zt) ⊙ Δmk ⊙ H̃t    (4)

where ⊙ denotes the Hadamard product and "*" denotes the convolution operation. The subscripts of the weights W indicate their roles: Whz is the update-gate hidden-state convolution weight, Wxz the update-gate input convolution weight, Whr the reset-gate hidden-state convolution weight, Wxr the reset-gate input convolution weight, and Whh and Wxh the new-hidden-state convolution weights for the previous hidden state and the input, respectively. When the actual network structure runs, all weights are shared across the encoding network. Ht-1 denotes the hidden state at time t-1, and k indexes the linked neighbor spatio-temporal data point being sampled.
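A sketch of the dynamic recurrent unit as a convolutional GRU cell, following formula (4). This is a minimal reading, not the patent's exact implementation: the candidate-input weight Wxh is assumed, and the Γ/φ(P) operator (dyn_sample) and the Δmk gate (dm_gate) are supplied externally — sketches of both follow steps B1 and B3 below.

```python
import torch
import torch.nn as nn

class DynamicGRUCell(nn.Module):
    """Dynamic recurrent unit: a convolutional GRU that reads its previous hidden
    state through the dynamic sampling operator Γ(H_{t-1}, φ(P)) (formula (4))."""
    def __init__(self, ch, k=3):
        super().__init__()
        p = k // 2
        self.w_hz = nn.Conv2d(ch, ch, k, padding=p)  # update-gate hidden weight W_hz
        self.w_xz = nn.Conv2d(ch, ch, k, padding=p)  # update-gate input weight  W_xz
        self.w_hr = nn.Conv2d(ch, ch, k, padding=p)  # reset-gate hidden weight  W_hr
        self.w_xr = nn.Conv2d(ch, ch, k, padding=p)  # reset-gate input weight   W_xr
        self.w_hh = nn.Conv2d(ch, ch, k, padding=p)  # candidate hidden weight   W_hh
        self.w_xh = nn.Conv2d(ch, ch, k, padding=p)  # candidate input weight (assumed)

    def forward(self, x, h, dyn_sample, dm_gate):
        g = dyn_sample(h)                                      # Γ(H_{t-1}, φ(P)), step B1
        z = torch.sigmoid(self.w_hz(g) + self.w_xz(x))         # update gate Z_t
        r = torch.sigmoid(self.w_hr(g) + self.w_xr(x))         # reset gate R_t, step B2
        h_cand = torch.tanh(self.w_hh(r * g) + self.w_xh(x))   # candidate hidden state
        dm = dm_gate(h, x)                                     # sampling amplitude Δm_k, step B3
        return z * h + (1 - z) * dm * h_cand                   # new hidden state H_t
```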
B1. The input consecutive trajectory frames processed by the variational autoencoder visual component first enter the update gate Zt. A dynamic link capability is added on top of the update gate of a conventional gated recurrent unit, implemented by φ(P), where P denotes the sampling positions of the convolution kernel in the feature map. Concretely, φ(P) applies a 3×3 convolutional neural network to the input consecutive trajectory frames X to obtain the position offsets of the spatio-temporal data, adds these offsets to the coordinates of the original input feature map, and obtains the pixel values at the shifted coordinates by bilinear interpolation. Finally, a 3×3 convolution kernel is applied to the values at the shifted positions. The update gate controls how much state information from the previous time step is carried into the current state. The function Γ(Ht-1, φ(P)) dynamically selects the spatio-temporal data sampling points.
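A sketch of the dynamic link operator from step B1. The class name DynSample is an assumption; for brevity the offsets here are predicted from the map being resampled (the patent derives φ(P) from the input trajectory frames), and a single offset per pixel stands in for per-kernel-point offsets.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynSample(nn.Module):
    """φ(P) and Γ: learn per-pixel offsets, resample the feature map bilinearly at the
    shifted coordinates, then apply a fixed 3x3 convolution (step B1)."""
    def __init__(self, ch):
        super().__init__()
        self.offset = nn.Conv2d(ch, 2, 3, padding=1)  # 3x3 conv predicts an (x, y) offset per pixel
        self.fixed = nn.Conv2d(ch, ch, 3, padding=1)  # fixed kernel applied at the shifted positions

    def forward(self, h):
        n, _, H, W = h.shape
        off = self.offset(h).permute(0, 2, 3, 1)      # data-driven position offsets, (n, H, W, 2)
        ys, xs = torch.meshgrid(torch.arange(H, device=h.device),
                                torch.arange(W, device=h.device), indexing='ij')
        base = torch.stack((xs, ys), dim=-1).float()  # original feature-map coordinates
        grid = base.unsqueeze(0) + off                # original coordinates plus learned offsets
        # normalize to [-1, 1], then bilinear resampling via F.grid_sample
        gx = 2 * grid[..., 0] / (W - 1) - 1
        gy = 2 * grid[..., 1] / (H - 1) - 1
        grid = torch.stack((gx, gy), dim=-1)
        return self.fixed(F.grid_sample(h, grid, mode='bilinear', align_corners=True))
```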
B2. After the update-gate operation, the input consecutive trajectory frames enter the reset gate, which likewise uses φ(P) to realize the dynamic link function.
B3. After the update and reset gates, the hidden state at the current time step is determined. Its sampling amplitude is given by Δmk, which is obtained by first applying a convolution to the spatio-temporal data (Ht-1, Xt) to produce an intermediate value and then applying a sigmoid function to obtain the sampling probability, whose range is [0,1]:

Δmk = σ(Wm * [Ht-1, Xt])    (5)
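A sketch of the Δmk gate of step B3 under the same assumptions; the weight symbol Wm and class name DmGate are illustrative.

```python
import torch
import torch.nn as nn

class DmGate(nn.Module):
    """Sampling-amplitude gate Δm_k (formula (5)): a convolution over the spatio-temporal
    data (H_{t-1}, X_t) followed by a sigmoid, giving per-pixel probabilities in [0, 1]."""
    def __init__(self, ch, k=3):
        super().__init__()
        self.w_m = nn.Conv2d(2 * ch, ch, k, padding=k // 2)

    def forward(self, h, x):
        return torch.sigmoid(self.w_m(torch.cat([h, x], dim=1)))
```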
C. Decoding the encoded data
C1. Based on the observations of the previous J input time-series trajectory image frames X, predict the most likely sequence of K future time-series trajectory image frames X;
C2. The prediction result is expressed by the following formula:
Xt+1, ..., Xt+K ≈ g_decode(f_encode(Xt-J+1, Xt-J+2, ..., Xt))    (6)
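A schematic of formula (6) under the earlier sketches' assumptions; the zero initial hidden state, the feedback of each prediction into the next step, and the omitted mapping from latent predictions back to image frames are assumptions rather than the patent's specification.

```python
import torch

def predict(frames, vae, enc_cell, dec_cell, dyn_sample, dm_gate, K):
    """Formula (6): encode the J observed frames, then roll out K future steps."""
    h = torch.zeros_like(vae(frames[0]))              # initial hidden state (assumed zeros)
    for x in frames:                                  # f_encode over X_{t-J+1}, ..., X_t
        h = enc_cell(vae(x), h, dyn_sample, dm_gate)
    preds, x = [], vae(frames[-1])
    for _ in range(K):                                # g_decode: K prediction steps
        h = dec_cell(x, h, dyn_sample, dm_gate)
        x = h                                         # feed each prediction back (assumed scheme)
        preds.append(h)
    return preds                                      # latent predictions for X_{t+1}, ..., X_{t+K}
```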
End.
Compared with the prior art, the invention has the following beneficial effects:
1. The dynamic link structure of the invention first learns the coordinate offsets of the feature map in a data-driven manner, then maps the pixel values of the original feature map to the new coordinate positions by bilinear interpolation, and finally down-samples with a fixed-connection convolution kernel to realize dynamic link changes. Compared with the prior art, the invention not only simulates the fluid-like spatio-temporal motion of multiple agents through the dynamic variation of the convolution kernel's sampling points, but also extracts spatial features of the agents' locations. Compared with the fixed convolutional recurrent network structures of previous models, the invention improves greatly in both structure and spatial data extraction, and learns from the data which pixels on the feature map to sample, reducing spatial feature redundancy.
2. The invention learns weights on the feature map with a fixed convolution kernel in a data-driven manner, then applies a sigmoid function to the learned weights to obtain the sampling amplitude of the spatio-temporal data (i.e., the sampling probability of each pixel), which better matches objective sampling behavior and improves model generalization.
3. The spatio-temporal prediction model of the invention treats the motion of dense multi-agents as a spatio-temporal pixel prediction problem. This prediction technique requires no agent trajectory points, enables multi-step prediction, improves model generalization, and reduces computational complexity.
4. The invention uses the reparameterization trick, so that the sampling operation itself does not take part in gradient descent; the sampled result does instead, allowing the model both to reduce its parameters and to remain trainable.
Description of the Drawings
The invention has four accompanying drawings, in which:
Figure 1 shows the dynamic link unit structure.
Figure 2 shows the encoding-decoding structure.
Figure 3 shows the prediction results.
Figure 4 is the flowchart of the invention.
Detailed Description of the Embodiments
The invention is further described below with reference to the drawings. The multi-agent trajectory prediction method with dynamic links from a third-person perspective (Figure 1) is introduced following the flow shown in Figure 4. First, consecutive temporally correlated trajectory frames are input to the visual component of the variational autoencoder in the encoding network for encoding, turning the high-dimensional input trajectory frames into low-dimensional latent variables. Specifically, the input time-series trajectory frames are mapped to high-dimensional vectors by a fully connected operation and then down-sampled by a convolution operation to obtain low-dimensional vector representations. The low-dimensional latent variables are fed to dynamic link unit 2 in the encoding network (Figure 2), which extracts the dynamic spatio-temporal data features of the trajectories. Specifically, position offsets are first learned from the fixed convolution kernel size, the pixel correspondence between the original feature map and the new feature map is then obtained by bilinear interpolation, and finally a fixed convolution kernel of the same size down-samples the new feature map, completing the dynamic link structure's sampling of the spatio-temporal data. The extracted spatio-temporal feature vectors then flow to dynamic link unit 1 for further spatio-temporal feature extraction, and so on, until training of the encoding network on the input consecutive trajectory frames is complete. The trained weights of the encoding network are copied into dynamic link unit 1 and dynamic link unit 2 of the decoding network, which then decodes and outputs the predicted consecutive trajectory frames (Figure 3), where the first column is the input historical motion trajectory sequence, the second column is the ground-truth trajectory sequence, and the third column is the predicted trajectory sequence.
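As a minimal illustration of the weight hand-off described above, assuming the encoder and decoder units are modules like the sketches in the Summary (the names are illustrative, not the patent's code):

```python
# Copy the trained encoding-network weights into the decoding network's
# dynamic link units 1 and 2 via a PyTorch state_dict hand-off
dec_unit1.load_state_dict(enc_unit1.state_dict())
dec_unit2.load_state_dict(enc_unit2.state_dict())
```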
Claims (1)
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201910807587.XA | 2019-08-29 | 2019-08-29 | Dense multi-agent track prediction method for dynamic link at third view angle |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN110503073A CN110503073A (en) | 2019-11-26 |
| CN110503073B true CN110503073B (en) | 2023-04-18 |
Families Citing this family (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN111814915B (en) * | 2020-08-26 | 2020-12-25 | 中国科学院自动化研究所 | Multi-agent space-time feature extraction method and system and behavior decision method and system |
| CN113111581B (en) * | 2021-04-09 | 2022-03-11 | 重庆邮电大学 | Combining spatiotemporal factors and graph neural network-based LSTM trajectory prediction method |
| CN114357232A (en) * | 2021-11-29 | 2022-04-15 | 武汉理工大学 | Processing method, system, device and storage medium for extracting ship track line features |
Patent Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20080152217A1 (en) * | 2006-05-16 | 2008-06-26 | Greer Douglas S | System and method for modeling the neocortex and uses therefor |
| CN108229338A (en) * | 2017-12-14 | 2018-06-29 | 华南理工大学 | A kind of video behavior recognition methods based on depth convolution feature |
| CN108334897A (en) * | 2018-01-22 | 2018-07-27 | 上海海事大学 | A kind of floating marine object trajectory predictions method based on adaptive GMM |
Non-Patent Citations (5)

| Title |
|---|
| 高玄 et al., "基于图像处理的人群行为识别方法综述" (A survey of crowd behavior recognition methods based on image processing), 《计算机与数字工程》, Vol. 44, No. 8, full text * |
| 张德正 et al., "基于深度卷积长短时神经网络的视频帧预测" (Video frame prediction based on deep convolutional long short-term memory neural networks), 《计算机应用》, full text * |
| 楼枫, "红外近距离单目标的检测算法分析" (Analysis of detection algorithms for close-range single infrared targets), 《中国优秀硕士学位论文全文数据库 信息科技辑》, full text * |
| Fengbin Zheng et al., "Target Recognition and Change Detection of SAR Image Based on Deep Learning", 《Proceedings of The 2019 World Congress on Computational Intelligence, Engineering and Information Technology (WCEIT 2019)》, 2019, full text * |
Also Published As
| Publication number | Publication date |
|---|---|
| CN110503073A (en) | 2019-11-26 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |