Gosztolya et al., 2019 - Google Patents

Calibrating AdaBoost for phoneme classification

Gosztolya et al., 2019

Document ID: 9727398424899525136
Author: Gosztolya G; Busa-Fekete R
Publication year: 2019
Publication venue: Soft Computing

External Links

Cited by

Snippet

Phoneme classification is a classification sub-task of automatic speech recognition (ASR), which is essential in order to achieve good speech recognition accuracy. However, unlike most classification tasks, besides finding the correct class, providing good posterior scores is …

Continue reading at www.inf.u-szeged.hu (PDF) (other versions)

238000000034 method 0 abstract description 37

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/14—Speech classification or search using statistical models, e.g. hidden Markov models [HMMs]
- G10L15/142—Hidden Markov Models [HMMs]
- G10L15/144—Training of HMMs
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/19—Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
- G10L15/197—Probabilistic grammars, e.g. word n-grams
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6279—Classification techniques relating to the number of classes
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6296—Graphical models, e.g. Bayesian networks
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computer systems based on specific mathematical models
- G06N7/005—Probabilistic networks
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/18—Digital computers in general; Data processing equipment in general in which a programme is changed according to experience gained by the computer itself during a complete run; Learning machines

Similar Documents

Publication	Publication Date	Title
Zazo et al.	2016	Language identification in short utterances using long short-term memory (LSTM) recurrent neural networks
Ganapathiraju	2002	Support vector machines for speech recognition
Lin et al.	2009	How to select a good training-data subset for transcription: Submodular active selection for sequences
Wang et al.	2013	Using parallel tokenizers with DTW matrix combination for low-resource spoken term detection
Kadyan et al.	2021	Enhancing accuracy of long contextual dependencies for Punjabi speech recognition system using deep LSTM
Shinoda	2011	Speaker adaptation techniques for automatic speech recognition
Kinnunen et al.	2009	Comparative evaluation of maximum a posteriori vector quantization and Gaussian mixture models in speaker verification
Nazir et al.	2023	A computer-aided speech analytics approach for pronunciation feedback using deep feature clustering
Gosztolya et al.	2019	Calibrating AdaBoost for phoneme classification
Zhao et al.	2014	Ensemble learning approaches in speech recognition
Hamooni et al.	2016	Phoneme sequence recognition via DTW-based classification
Lopes et al.	2012	Broad phonetic class definition driven by phone confusions
Lee	2008	Principles of spoken language recognition
Manjunath et al.	2020	Articulatory-feature-based methods for performance improvement of Multilingual Phone Recognition Systems using Indian languages
Madhavi et al.	2018	Design of mixture of GMMs for query-by-example spoken term detection
Gündogdu	2017	Keyword search for low resource languages
Chou et al.	2003	Minimum classification error (MCE) approach in pattern recognition
Shinoda	2010	Acoustic model adaptation for speech recognition
Tomashenko	2017	Speaker adaptation of deep neural network acoustic models using Gaussian mixture model framework in automatic speech recognition systems
Ferrer	2009	Statistical modeling of heterogeneous features for speech processing tasks
Matton et al.	2010	Minimum classification error training in example based speech and pattern recognition using sparse weight matrices
Bykov et al.	2019	Improvement of the learning process of the automated speaker recognition system for critical use with HMM-DNN component
Sen et al.	2019	Audio classification
Jyothi et al.	2014	Revisiting word neighborhoods for speech recognition
Patil et al.	2019	Linear collaborative discriminant regression and Cepstra features for Hindi speech recognition