[go: up one dir, main page]

Chen et al., 2017 - Google Patents

DCASE2017 sound event detection using convolutional neural network

Chen et al., 2017

View PDF
Document ID
1235584743358111077
Author
Chen Y
Zhang Y
Duan Z
Publication year
Publication venue
Detection and classification of acoustic scenes and events

External Links

Snippet

ABSTRACT The DCASE2017 Challenge Task 3 is to develop a sound event detection system of real life audio. In our setup, we merge the two channels into one, then use Mel- band energy to calculate the converted spectrum, and train the model using a convolutional …
Continue reading at dcase.community (PDF) (other versions)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computer systems based on biological models
    • G06N3/02Computer systems based on biological models using neural network models
    • G06N3/08Learning methods
    • G06N3/082Learning methods modifying the architecture, e.g. adding or deleting nodes or connections, pruning
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computer systems based on biological models
    • G06N3/02Computer systems based on biological models using neural network models
    • G06N3/04Architectures, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62Methods or arrangements for recognition using electronic means
    • G06K9/6267Classification techniques
    • G06K9/6268Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62Methods or arrangements for recognition using electronic means
    • G06K9/6217Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06K9/6232Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods
    • G06K9/6247Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods based on an approximation criterion, e.g. principal component analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computer systems based on biological models
    • G06N3/02Computer systems based on biological models using neural network models
    • G06N3/06Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/36Image preprocessing, i.e. processing the image information without deciding about the identity of the image
    • G06K9/46Extraction of features or characteristics of the image
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00221Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
    • G06K9/00268Feature extraction; Face representation
    • G06K9/00281Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00335Recognising movements or behaviour, e.g. recognition of gestures, dynamic facial expressions; Lip-reading
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00Subject matter not provided for in other groups of this subclass
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00624Recognising scenes, i.e. recognition of a whole field of perception; recognising scene-specific objects
    • G06K9/00771Recognising scenes under surveillance, e.g. with Markovian modelling of scene activity
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions

Similar Documents

Publication Publication Date Title
Chen et al. DCASE2017 sound event detection using convolutional neural network
Khaire et al. A semi-supervised deep learning based video anomaly detection framework using RGB-D for surveillance of real-world critical environments
Lea et al. Temporal convolutional networks for action segmentation and detection
Amiriparian et al. Bag-of-deep-features: Noise-robust deep feature representations for audio analysis
Yu et al. A multi-layer parallel lstm network for human activity recognition with smartphone sensors
CN112183107A (en) Audio processing method and device
CN115273814B (en) Fake voice detection method, device, computer equipment and storage medium
Dang et al. A survey of deep learning for polyphonic sound event detection
Ding et al. Adaptive multi-scale detection of acoustic events
Tzinis et al. Integrating recurrence dynamics for speech emotion recognition
Liu et al. Graph Isomorphism Network for Speech Emotion Recognition.
Zhang et al. MTF-CRNN: Multiscale time-frequency convolutional recurrent neural network for sound event detection
Ying et al. A multimodal driver emotion recognition algorithm based on the audio and video signals in internet of vehicles platform
US9269045B2 (en) Auditory source separation in a spiking neural network
Shahin et al. COVID-19 electrocardiograms classification using CNN models
Jung et al. DNN-Based Audio Scene Classification for DCASE2017: Dual Input Features, Balancing Cost, and Stochastic Data Duplication.
Han et al. Anomaly detection in health data based on deep learning
Vlasov et al. Spoken digits classification based on spiking neural networks with memristor-based STDP
Dang et al. Deep learning for DCASE2017 challenge
Du et al. Speech emotion recognition based on spiking neural network and convolutional neural network
Marepalli et al. Early Detection of Chronic Obstructive Pulmonary Disease in Respiratory Audio Signals Using CNN and LSTM Models
Kächele et al. Fusion mappings for multimodal affect recognition
Islam et al. DCNN-LSTM based audio classification combining multiple feature engineering and data augmentation techniques
Muscar et al. Deep Learning-Based Sound Classification Algorithms for Enhanced Service Robots Audio Capabilities
Rohan et al. Emotion recognition through speech signal using python