Zhang et al., 2014 - Google Patents
Distant-talking speaker identification by generalized spectral subtraction-based dereverberation and its efficient computation
- Document ID
- 13891794034993811758
- Author
- Zhang Z
- Wang L
- Kai A
- Publication year
- 2014
- Publication venue
- EURASIP Journal on Audio, Speech, and Music Processing
Snippet
Previously, a dereverberation method based on generalized spectral subtraction (GSS) using multi-channel least mean-squares (MCLMS) has been proposed. The results of speech recognition experiments showed that this method achieved a significant …
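The snippet names the approach only at a high level. As a rough illustration of what a generalized spectral subtraction (GSS) dereverberation step can look like, the Python sketch below applies per-frame subtraction of an estimated late-reverberation spectrum with a spectral floor. It is not the paper's exact method: the inputs `stft_frames` and `late_weights` and the parameters `alpha`, `beta`, and `gamma` are hypothetical stand-ins, and in the cited work the late-reverberation estimate would come from multi-channel LMS (MCLMS)-estimated impulse responses rather than fixed weights.

```python
import numpy as np

def gss_dereverb(stft_frames, late_weights, alpha=1.0, beta=0.01, gamma=1.0):
    """Minimal GSS-style dereverberation sketch (illustrative only).

    stft_frames : complex STFT of the reverberant signal, shape (T, F)
    late_weights: per-frequency weights giving the late-reverberation
                  contribution of each of the previous D frames, shape (D, F)
    alpha       : over-subtraction factor
    beta        : spectral floor (fraction of the observed spectrum kept)
    gamma       : generalized-subtraction exponent (1.0 = power domain,
                  0.5 = magnitude domain)
    """
    T, F = stft_frames.shape
    D = late_weights.shape[0]
    obs = np.abs(stft_frames) ** (2 * gamma)   # observed spectrum in the chosen domain
    out = np.empty_like(stft_frames)

    for t in range(T):
        # Estimate late reverberation from the previous D frames.
        rev = np.zeros(F)
        for d in range(1, D + 1):
            if t - d >= 0:
                rev += late_weights[d - 1] * obs[t - d]
        # Generalized spectral subtraction with a spectral floor.
        clean = np.maximum(obs[t] - alpha * rev, beta * obs[t])
        gain = (clean / np.maximum(obs[t], 1e-12)) ** (1.0 / (2 * gamma))
        out[t] = gain * stft_frames[t]         # keep the observed phase
    return out
```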
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
- G10L2021/02166—Microphone arrays; Beamforming
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0202—Applications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
- G10L25/18—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding, i.e. using interchannel correlation to reduce redundancies, e.g. joint-stereo, intensity-coding, matrixing
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis using predictive techniques
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers, loudspeakers or microphones
- H04R3/005—Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
Similar Documents
| Publication | Title |
|---|---|
| Tan et al. | Neural spectrospatial filtering |
| Taherian et al. | Robust speaker recognition based on single-channel and multi-channel speech enhancement |
| Delcroix et al. | Strategies for distant speech recognition in reverberant environments |
| Zhang et al. | Deep neural network-based bottleneck feature and denoising autoencoder-based dereverberation for distant-talking speaker identification |
| Shimada et al. | Unsupervised speech enhancement based on multichannel NMF-informed beamforming for noise-robust automatic speech recognition |
| JP5738020B2 (en) | Speech recognition apparatus and speech recognition method |
| Cauchi et al. | Combination of MVDR beamforming and single-channel spectral processing for enhancing noisy and reverberant speech |
| Li et al. | Multichannel speech enhancement based on time-frequency masking using subband long short-term memory |
| Perotin et al. | Multichannel speech separation with recurrent neural networks from high-order ambisonics recordings |
| Xiao et al. | Speech dereverberation for enhancement and recognition using dynamic features constrained deep neural networks and feature adaptation |
| Xiao et al. | The NTU-ADSC systems for reverberation challenge 2014 |
| Nakatani et al. | Dominance based integration of spatial and spectral features for speech enhancement |
| Wang et al. | Dereverberation and denoising based on generalized spectral subtraction by multi-channel LMS algorithm using a small-scale microphone array |
| Zhang et al. | Distant-talking speaker identification by generalized spectral subtraction-based dereverberation and its efficient computation |
| Xiong et al. | Front-end technologies for robust ASR in reverberant environments—spectral enhancement-based dereverberation and auditory modulation filterbank features |
| Nesta et al. | A flexible spatial blind source extraction framework for robust speech recognition in noisy environments |
| JP5180928B2 (en) | Speech recognition apparatus and mask generation method for speech recognition apparatus |
| Song et al. | An integrated multi-channel approach for joint noise reduction and dereverberation |
| Squartini et al. | Environmental robust speech and speaker recognition through multi-channel histogram equalization |
| Alam et al. | Speech recognition in reverberant and noisy environments employing multiple feature extractors and i-vector speaker adaptation |
| Astudillo et al. | Integration of beamforming and uncertainty-of-observation techniques for robust ASR in multi-source environments |
| Wang et al. | Distant-talking speech recognition based on spectral subtraction by multi-channel LMS algorithm |
| Mowlaee | CHiME challenge: Approaches to robustness using beamforming and uncertainty-of-observation techniques |
| Chen et al. | A multichannel learning-based approach for sound source separation in reverberant environments |
| Mandel et al. | Multichannel Spatial Clustering for Robust Far-Field Automatic Speech Recognition in Mismatched Conditions |