[go: up one dir, main page]

Li et al., 2021 - Google Patents

Oriental language recognition (OLR) 2020: Summary and analysis

Li et al., 2021

View PDF
Document ID
16424020177847016000
Author
Li J
Wang B
Zhi Y
Li Z
Li L
Hong Q
Wang D
Publication year
Publication venue
arXiv preprint arXiv:2107.05365

External Links

Snippet

The fifth Oriental Language Recognition (OLR) Challenge focuses on language recognition in a variety of complex environments to promote its development. The OLR 2020 Challenge includes three tasks:(1) cross-channel language identification,(2) dialect identification, and …
Continue reading at arxiv.org (PDF) (other versions)

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • G10L15/14Speech classification or search using statistical models, e.g. hidden Markov models [HMMs]
    • G10L15/142Hidden Markov Models [HMMs]
    • G10L15/144Training of HMMs
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • G10L15/18Speech classification or search using natural language modelling
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/06Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/065Adaptation
    • G10L15/07Adaptation to the speaker
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • G10L2015/088Word spotting
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G10L25/48Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
    • G10L25/51Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/66Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00Speaker identification or verification
    • G10L17/04Training, enrolment or model building
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00Speaker identification or verification
    • G10L17/26Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00Speaker identification or verification
    • G10L17/06Decision making techniques; Pattern matching strategies
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/28Constructional details of speech recognition systems
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/005Language recognition
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G10L25/03Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00Subject matter not provided for in other groups of this subclass

Similar Documents

Publication Publication Date Title
Zhang et al. End-to-end attention based text-dependent speaker verification
Ma et al. Short utterance based speech language identification in intelligent vehicles with time-scale modifications and deep bottleneck features
Poddar et al. Speaker verification with short utterances: a review of challenges, trends and opportunities
Song et al. Noise invariant frame selection: a simple method to address the background noise problem for text-independent speaker verification
Thiolliere et al. A hybrid dynamic time warping-deep neural network architecture for unsupervised acoustic modeling.
Li et al. Oriental language recognition (OLR) 2020: Summary and analysis
TWI395201B (en) Method and system for identifying emotional voices
Xu English speech recognition and evaluation of pronunciation quality using deep learning
CN104240706B (en) It is a kind of that the method for distinguishing speek person that similarity corrects score is matched based on GMM Token
CN114299918B (en) Acoustic model training and speech synthesis method, device and system and storage medium
CN107093422A (en) A kind of audio recognition method and speech recognition system
Mao et al. Applying multitask learning to acoustic-phonemic model for mispronunciation detection and diagnosis in l2 english speech
Velichko et al. Complex Paralinguistic Analysis of Speech: Predicting Gender, Emotions and Deception in a Hierarchical Framework.
CN106297769A (en) A kind of distinctive feature extracting method being applied to languages identification
CN114220419A (en) A voice evaluation method, device, medium and equipment
Dar et al. Bi-directional LSTM-based isolated spoken word recognition for Kashmiri language utilizing Mel-spectrogram feature
Bera et al. Identification of mental state through speech using a deep learning approach
Luo et al. Emotional Voice Conversion Using Neural Networks with Different Temporal Scales of F0 based on Wavelet Transform.
Ansari et al. Deep learning methods for unsupervised acoustic modeling—leap submission to zerospeech challenge 2017
Yu et al. Language Recognition Based on Unsupervised Pretrained Models.
Mansour et al. Speaker recognition in emotional context
Park et al. Automatic speech recognition system-independent word error rate estimation
Zhang et al. Discriminatively trained sparse inverse covariance matrices for speech recognition
Kalita et al. Use of bidirectional long short term memory in spoken word detection with reference to the Assamese language
Pereira et al. Automatic phoneme recognition by deep neural networks