Roma et al., 2016 - Google Patents

Untwist: A new toolbox for audio source separation

Document ID
779757486649663853
Author
Roma G
Grais E
Simpson A
Sobieraj I
Plumbley M
Publication year
2016
Publication venue
Extended Abstracts for the Late-Breaking Demo Session of the 17th International Society for Music Information Retrieval Conference (ISMIR)

Snippet

Untwist is a new open-source toolbox for audio source separation. The library provides a self-contained object-oriented framework including common source separation algorithms as well as input/output functions, data management utilities and time-frequency transforms …

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/065 Adaptation
    • G10L15/07 Adaptation to the speaker
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/14 Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
    • G06F17/141 Discrete Fourier transforms
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/50 Computer-aided design
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
    • G10L19/008 Multichannel audio signal coding or decoding, i.e. using interchannel correlation to reduce redundancies, e.g. joint-stereo, intensity-coding, matrixing
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification

Similar Documents

Publication / Title
Kadıoğlu et al. An empirical study of Conv-TasNet
US20230162758A1 (en) Systems and methods for speech enhancement using attention masking and end to end neural networks
Grais et al. Single channel speech music separation using nonnegative matrix factorization and spectral masks
Grais et al. Two-stage single-channel audio source separation using deep neural networks
CN102637435A (en) Audio signal processing device, audio signal processing method, and program
US9646631B2 (en) Audio signal processing apparatus and method thereof
EP3143619A1 (en) Method and system of on-the-fly audio source separation
CN1653519A (en) Method for robust voice recognition by analyzing redundant features of source signal
WO2021161543A1 (en) Signal processing device, signal processing method, and signal processing program
US20220130407A1 (en) Method for isolating sound, electronic equipment, and storage medium
Weninger et al. Optimization and parallelization of monaural source separation algorithms in the openBliSSART toolkit
Firooz et al. Improvement of automatic speech recognition systems via nonlinear dynamical features evaluated from the recurrence plot of speech signals
CN112116922B (en) Noise blind source signal separation method, terminal equipment and storage medium
Roma et al. Untwist: A new toolbox for audio source separation
JP2020012928A (en) Noise-tolerant speech recognition apparatus and method, and computer program
KR20180079975A (en) Sound source separation method using spatial position of the sound source and non-negative matrix factorization and apparatus performing the method
Kumar et al. Speech mel frequency cepstral coefficient feature classification using multi level support vector machine
Weninger et al. Recognition of nonprototypical emotions in reverberated and noisy speech by nonnegative matrix factorization
Li et al. TEnet: target speaker extraction network with accumulated speaker embedding for automatic speech recognition
US11580967B2 (en) Speech feature extraction apparatus, speech feature extraction method, and computer-readable storage medium
Hidayat et al. Feature extraction of the Indonesian phonemes using discrete wavelet and wavelet packet transform
Baranwal et al. A speech recognition technique using MFCC with DWT in isolated Hindi words
TWI409802B (en) Method and apparatus for processing audio feature
Burred Cross-synthesis based on spectrogram factorization
Tran et al. Towards privacy-preserving speech representation for client-side data sharing