DCASE2017 sound event detection using convolutional neural network
Chen et al., 2017
- Document ID
- 1235584743358111077
- Author
- Chen Y
- Zhang Y
- Duan Z
- Publication year
- 2017
- Publication venue
- Detection and classification of acoustic scenes and events
Snippet
ABSTRACT: The DCASE2017 Challenge Task 3 is to develop a sound event detection system for real-life audio. In our setup, we merge the two channels into one, then use Mel-band energy to calculate the converted spectrum, and train the model using a convolutional …
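The front end described in the snippet (merging two channels into one, then computing Mel-band energies) might look roughly like the NumPy sketch below. This is an illustration only, not the authors' code: the function names, the 44.1 kHz sample rate, the 2048-point FFT with 50% hop, the 40 Mel bands, and the log compression are all assumptions that the snippet does not state.

```python
import numpy as np

def hz_to_mel(hz):
    # HTK-style mel scale (an assumed choice of mel formula).
    return 2595.0 * np.log10(1.0 + np.asarray(hz, dtype=float) / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular mel filters, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    # n_mels + 2 band edges, equally spaced on the mel scale.
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    bank = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        bank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return bank

def log_mel_band_energies(stereo, sr=44100, n_fft=2048, hop=1024, n_mels=40):
    """stereo: float array of shape (n_samples, 2). Returns (n_frames, n_mels)."""
    # Merge the two channels into one by averaging them.
    mono = stereo.mean(axis=1)
    # Slice into overlapping Hann-windowed frames (no padding at the edges).
    n_frames = 1 + (mono.size - n_fft) // hop
    window = np.hanning(n_fft)
    frames = np.stack(
        [mono[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    # Power spectrum per frame, then sum the power inside each mel band.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel_energy = power @ mel_filterbank(sr, n_fft, n_mels).T
    # Log compression; the small constant avoids log(0) on silent frames.
    return np.log(mel_energy + 1e-10)
```

Under these assumed settings, one second of stereo audio at 44.1 kHz yields a 42 x 40 feature matrix, which is the kind of time-frequency input a convolutional model would typically be trained on.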
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding or deleting nodes or connections, pruning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/04—Architectures, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G06K9/6232—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods
- G06K9/6247—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods based on an approximation criterion, e.g. principal component analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G06K9/00268—Feature extraction; Face representation
- G06K9/00281—Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00335—Recognising movements or behaviour, e.g. recognition of gestures, dynamic facial expressions; Lip-reading
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00624—Recognising scenes, i.e. recognition of a whole field of perception; recognising scene-specific objects
- G06K9/00771—Recognising scenes under surveillance, e.g. with Markovian modelling of scene activity
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Chen et al. | | DCASE2017 sound event detection using convolutional neural network |
| Khaire et al. | | A semi-supervised deep learning based video anomaly detection framework using RGB-D for surveillance of real-world critical environments |
| Lea et al. | | Temporal convolutional networks for action segmentation and detection |
| Amiriparian et al. | | Bag-of-deep-features: Noise-robust deep feature representations for audio analysis |
| Yu et al. | | A multi-layer parallel LSTM network for human activity recognition with smartphone sensors |
| CN112183107A (en) | | Audio processing method and device |
| CN115273814B (en) | | Fake voice detection method, device, computer equipment and storage medium |
| Dang et al. | | A survey of deep learning for polyphonic sound event detection |
| Ding et al. | | Adaptive multi-scale detection of acoustic events |
| Tzinis et al. | | Integrating recurrence dynamics for speech emotion recognition |
| Liu et al. | | Graph Isomorphism Network for Speech Emotion Recognition |
| Zhang et al. | | MTF-CRNN: Multiscale time-frequency convolutional recurrent neural network for sound event detection |
| Ying et al. | | A multimodal driver emotion recognition algorithm based on the audio and video signals in internet of vehicles platform |
| US9269045B2 (en) | | Auditory source separation in a spiking neural network |
| Shahin et al. | | COVID-19 electrocardiograms classification using CNN models |
| Jung et al. | | DNN-Based Audio Scene Classification for DCASE2017: Dual Input Features, Balancing Cost, and Stochastic Data Duplication |
| Han et al. | | Anomaly detection in health data based on deep learning |
| Vlasov et al. | | Spoken digits classification based on spiking neural networks with memristor-based STDP |
| Dang et al. | | Deep learning for DCASE2017 challenge |
| Du et al. | | Speech emotion recognition based on spiking neural network and convolutional neural network |
| Marepalli et al. | | Early Detection of Chronic Obstructive Pulmonary Disease in Respiratory Audio Signals Using CNN and LSTM Models |
| Kächele et al. | | Fusion mappings for multimodal affect recognition |
| Islam et al. | | DCNN-LSTM based audio classification combining multiple feature engineering and data augmentation techniques |
| Muscar et al. | | Deep Learning-Based Sound Classification Algorithms for Enhanced Service Robots Audio Capabilities |
| Rohan et al. | | Emotion recognition through speech signal using Python |