Chen et al., 2017 - Google Patents

DCASE2017 sound event detection using convolutional neural network

Document ID: 1235584743358111077
Author: Chen Y; Zhang Y; Duan Z
Publication year: 2017
Publication venue: Detection and classification of acoustic scenes and events

Snippet

ABSTRACT The DCASE2017 Challenge Task 3 is to develop a sound event detection system of real life audio. In our setup, we merge the two channels into one, then use Mel-band energy to calculate the converted spectrum, and train the model using a convolutional …
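
The front end described in the abstract (merging two channels into one, then computing Mel-band energies) might look roughly like the NumPy sketch below. This is an illustration only, not the authors' code: the function names, the 44.1 kHz sample rate, the 2048-sample Hann window, the 1024-sample hop, and the 40 Mel bands are all assumptions, since the snippet does not state them. The convolutional model itself is not sketched because the abstract is truncated before describing it.

```python
import numpy as np

def downmix_to_mono(stereo):
    """Average the channels of a (num_samples, num_channels) array."""
    return stereo.mean(axis=1)

def hz_to_mel(f):
    # A common Mel-scale formula; assumed here, not taken from the paper.
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular Mel filters, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    edges = mel_to_hz(mel_points)
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, mid, hi = edges[i:i + 3]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel_energies(mono, sr=44100, n_fft=2048, hop=1024, n_mels=40):
    """Return log Mel-band energies, shape (num_frames, n_mels)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(mono) - n_fft) // hop
    frames = np.stack(
        [mono[i * hop:i * hop + n_fft] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T
    # Small constant (an assumption) keeps the log finite for silent frames.
    return np.log(mel + 1e-10)
```

Calling `log_mel_energies(downmix_to_mono(stereo))` on a `(num_samples, 2)` array gives one feature row per frame, which could then be stacked into fixed-length windows as input to a convolutional network.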

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding or deleting nodes or connections, pruning
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/04—Architectures, e.g. interconnection topology
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G06K9/6232—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods
- G06K9/6247—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods based on an approximation criterion, e.g. principal component analysis
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G06K9/00268—Feature extraction; Face representation
- G06K9/00281—Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00335—Recognising movements or behaviour, e.g. recognition of gestures, dynamic facial expressions; Lip-reading
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00624—Recognising scenes, i.e. recognition of a whole field of perception; recognising scene-specific objects
- G06K9/00771—Recognising scenes under surveillance, e.g. with Markovian modelling of scene activity
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions

Similar Documents

Publication	Publication Date	Title
Chen et al.	2017	DCASE2017 sound event detection using convolutional neural network
Khaire et al.	2022	A semi-supervised deep learning based video anomaly detection framework using RGB-D for surveillance of real-world critical environments
Lea et al.	2017	Temporal convolutional networks for action segmentation and detection
Amiriparian et al.	2018	Bag-of-deep-features: Noise-robust deep feature representations for audio analysis
Yu et al.	2018	A multi-layer parallel LSTM network for human activity recognition with smartphone sensors
CN112183107A (en)	2021-01-05	Audio processing method and device
CN115273814B (en)	2025-04-29	Fake voice detection method, device, computer equipment and storage medium
Dang et al.	2017	A survey of deep learning for polyphonic sound event detection
Ding et al.	2019	Adaptive multi-scale detection of acoustic events
Tzinis et al.	2018	Integrating recurrence dynamics for speech emotion recognition
Liu et al.	2021	Graph Isomorphism Network for Speech Emotion Recognition
Zhang et al.	2020	MTF-CRNN: Multiscale time-frequency convolutional recurrent neural network for sound event detection
Ying et al.	2024	A multimodal driver emotion recognition algorithm based on the audio and video signals in internet of vehicles platform
US9269045B2 (en)	2016-02-23	Auditory source separation in a spiking neural network
Shahin et al.	2021	COVID-19 electrocardiograms classification using CNN models
Jung et al.	2017	DNN-Based Audio Scene Classification for DCASE2017: Dual Input Features, Balancing Cost, and Stochastic Data Duplication
Han et al.	2018	Anomaly detection in health data based on deep learning
Vlasov et al.	2022	Spoken digits classification based on spiking neural networks with memristor-based STDP
Dang et al.	2017	Deep learning for DCASE2017 challenge
Du et al.	2025	Speech emotion recognition based on spiking neural network and convolutional neural network
Marepalli et al.	2024	Early Detection of Chronic Obstructive Pulmonary Disease in Respiratory Audio Signals Using CNN and LSTM Models
Kächele et al.	2015	Fusion mappings for multimodal affect recognition
Islam et al.	2021	DCNN-LSTM based audio classification combining multiple feature engineering and data augmentation techniques
Muscar et al.	2024	Deep Learning-Based Sound Classification Algorithms for Enhanced Service Robots Audio Capabilities
Rohan et al.	2020	Emotion recognition through speech signal using Python