Untwist: A new toolbox for audio source separation
Roma et al., 2016
- Document ID
- 779757486649663853
- Author
- Roma G
- Grais E
- Simpson A
- Sobieraj I
- Plumbley M
- Publication year
- 2016
- Publication venue
- Extended Abstracts for the Late-Breaking Demo Session of the 17th International Society for Music Information Retrieval Conference (ISMIR)
Snippet
Untwist is a new open-source toolbox for audio source separation. The library provides a self-contained object-oriented framework including common source separation algorithms as well as input/output functions, data management utilities and time-frequency transforms …
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/14—Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
- G06F17/141—Discrete Fourier transforms
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding, i.e. using interchannel correlation to reduce redundancies, e.g. joint-stereo, intensity-coding, matrixing
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| Kadıoğlu et al. | An empirical study of Conv-TasNet | |
| US20230162758A1 (en) | Systems and methods for speech enhancement using attention masking and end to end neural networks | |
| Grais et al. | Single channel speech music separation using nonnegative matrix factorization and spectral masks | |
| Grais et al. | Two-stage single-channel audio source separation using deep neural networks | |
| CN102637435A (en) | Audio signal processing device, audio signal processing method, and program | |
| US9646631B2 (en) | Audio signal processing apparatus and method thereof | |
| EP3143619A1 (en) | Method and system of on-the-fly audio source separation | |
| CN1653519A (en) | Method for robust voice recognition by analyzing redundant features of source signal | |
| WO2021161543A1 (en) | Signal processing device, signal processing method, and signal processing program | |
| US20220130407A1 (en) | Method for isolating sound, electronic equipment, and storage medium | |
| Weninger et al. | Optimization and parallelization of monaural source separation algorithms in the openBliSSART toolkit | |
| Firooz et al. | Improvement of automatic speech recognition systems via nonlinear dynamical features evaluated from the recurrence plot of speech signals | |
| CN112116922B (en) | Noise blind source signal separation method, terminal equipment and storage medium | |
| Roma et al. | Untwist: A new toolbox for audio source separation | |
| JP2020012928A (en) | Noise-tolerant speech recognition apparatus and method, and computer program | |
| KR20180079975A (en) | Sound source separation method using spatial position of the sound source and non-negative matrix factorization and apparatus performing the method | |
| Kumar et al. | Speech mel frequency cepstral coefficient feature classification using multi level support vector machine | |
| Weninger et al. | Recognition of nonprototypical emotions in reverberated and noisy speech by nonnegative matrix factorization | |
| Li et al. | TEnet: target speaker extraction network with accumulated speaker embedding for automatic speech recognition | |
| US11580967B2 (en) | Speech feature extraction apparatus, speech feature extraction method, and computer-readable storage medium | |
| Hidayat et al. | Feature extraction of the Indonesian phonemes using discrete wavelet and wavelet packet transform | |
| Baranwal et al. | A speech recognition technique using mfcc with dwt in isolated hindi words | |
| TWI409802B (en) | Method and apparatus for processing audio feature | |
| Burred | Cross-synthesis based on spectrogram factorization | |
| Tran et al. | Towards privacy-preserving speech representation for client-side data sharing |