Yu et al., 2024 - Google Patents
Svformer: a direct training spiking transformer for efficient video action recognitionYu et al., 2024
View PDF- Document ID
- 9738236995886040139
- Author
- Yu L
- Huang L
- Zhou C
- Zhang H
- Ma Z
- Zhou H
- Tian Y
- Publication year
- Publication venue
- International Workshop on Human Brain and Artificial Intelligence
External Links
Snippet
Video action recognition (VAR) plays crucial roles in various domains such as surveillance, healthcare, and industrial automation, making it highly significant for the society. Consequently, it has long been a research spot in the computer vision field. As artificial …
- 238000012421 spiking 0 title abstract description 26
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/04—Architectures, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G06K9/6232—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods
- G06K9/6247—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods based on an approximation criterion, e.g. principal component analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/03—Arrangements for converting the position or the displacement of a member into a coded form
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce, e.g. shopping or e-commerce
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10024—Color image
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Kim et al. | Neural architecture search for spiking neural networks | |
| Liu et al. | Semantics-aware adaptive knowledge distillation for sensor-to-vision action recognition | |
| Yu et al. | Searching multi-rate and multi-modal temporal enhanced networks for gesture recognition | |
| Kim et al. | Beyond classification: Directly training spiking neural networks for semantic segmentation | |
| Liu et al. | Spatiotemporal saliency-based multi-stream networks with attention-aware LSTM for action recognition | |
| Hu et al. | Boosting lightweight depth estimation via knowledge distillation | |
| Wang et al. | Sstformer: Bridging spiking neural network and memory support transformer for frame-event based recognition | |
| Sagar | Dmsanet: Dual multi scale attention network | |
| Li et al. | Multi-scale residual network model combined with Global Average Pooling for action recognition | |
| Yin et al. | Efficient skeleton-based action recognition via multi-stream depthwise separable convolutional neural network | |
| Ma et al. | Multi-view time-series hypergraph neural network for action recognition | |
| Tseng et al. | MiniNet: Dense squeeze with depthwise separable convolutions for image classification in resource-constrained autonomous systems | |
| Yu et al. | Svformer: a direct training spiking transformer for efficient video action recognition | |
| Ning et al. | Very fast semantic image segmentation using hierarchical dilation and feature refining | |
| Yu et al. | Hand gesture recognition based on attentive feature fusion | |
| Zhong et al. | Multimodal cooperative self‐attention network for action recognition | |
| Wu et al. | STCA-SNN: self-attention-based temporal-channel joint attention for spiking neural networks | |
| Li et al. | Adaptive feature self-attention in spiking neural networks for hyperspectral classification | |
| Chen et al. | Retain, blend, and exchange: A quality-aware spatial-stereo fusion approach for event stream recognition | |
| Liu et al. | MSSTResNet-TLD: A robust tracking method based on tracking-learning-detection framework by using multi-scale spatio-temporal residual network feature model | |
| Li et al. | Manet: motion-aware network for video action recognition | |
| Sharma et al. | BMST-Net: bidirectional multi-scale spatiotemporal network for salient object detection in videos | |
| Deng et al. | A dynamic graph cnn with cross-representation distillation for event-based recognition | |
| Zhang et al. | Razor SNN: efficient spiking neural network with temporal embeddings | |
| Ji et al. | A recognition method for Italian alphabet gestures based on convolutional neural network |