Mihkla et al., 2008 - Google Patents

Development of a unit selection TTS system for Estonian

Mihkla et al., 2008

Document ID: 258001966201930954
Author: Mihkla M; Piits L; Nurk T; Kiissel I
Publication year: 2008
Publication venue: Proceedings of the Third Baltic Conference on Human Language Technologies

External Links

Cited by

Snippet

The first generation of Estonian corpus-based TTS systems appeared in 2002. That speech synthesizer used the MBROLA speech engine and its diphone database contained just one sample of each diphone occurring in the Estonian language. In 2006, a new project was …

Continue reading at www.researchgate.net (PDF) (other versions)

238000011161 development 0 title description 5

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/187—Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/08—Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
- G10L13/10—Prosody rules derived from text; Stress or intonation
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/06—Elementary speech units used in speech synthesisers; Concatenation rules
- G10L13/07—Concatenation rules
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
- G10L13/033—Voice editing, e.g. manipulating the voice of the synthesiser
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
- G10L13/027—Concept to speech synthesisers; Generation of natural phrases from machine-based concepts
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
- G10L13/04—Details of speech synthesis systems, e.g. synthesiser structure or memory management
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/003—Changing voice quality, e.g. pitch or formants
- G10L21/007—Changing voice quality, e.g. pitch or formants characterised by the process used
- G10L21/013—Adapting to target pitch
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
- G10L21/10—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids transforming into visible information
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2872—Rule based translation
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2809—Data driven translation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification

Similar Documents

Publication	Publication Date	Title
US8219398B2 (en)	2012-07-10	Computerized speech synthesizer for synthesizing speech from text
Dutoit	1997	High-quality text-to-speech synthesis: An overview
US20090048843A1 (en)	2009-02-19	System-effected text annotation for expressive prosody in speech synthesis and recognition
JP3587048B2 (en)	2004-11-10	Prosody control method and speech synthesizer
Mache et al.	2015	Review on text-to-speech synthesizer
Macchi	1998	Issues in text-to-speech synthesis
Dutoit	1999	A short introduction to text-to-speech synthesis
RU2386178C2 (en)	2010-04-10	Method for preliminary processing of text
Mihkla et al.	2008	Development of a unit selection TTS system for Estonian
Murthy et al.	2010	Building unit selection speech synthesis in Indian languages: An initiative by an Indian consortium
Shirbahadurkar et al.	2009	Marathi language speech synthesizer using concatenative synthesis strategy (spoken in Maharashtra, India)
Ekpenyong et al.	2008	Towards an unrestricted domain TTS system for African tone languages
Lobanov et al.	2006	Development of multi-voice and multi-language TTS synthesizer (languages: Belarussian, Polish, Russian)
Roux et al.	2007	Data-driven approach to rapid prototyping Xhosa speech synthesis
Khalifa et al.	2008	SMaTalk: Standard malay text to speech talk system
Nguyen	2015	Hmm-based vietnamese text-to-speech: Prosodic phrasing modeling, corpus design system design, and evaluation
Ng	1998	Survey of data-driven approaches to Speech Synthesis
Kaur et al.	2017	BUILDING AText-TO-SPEECH SYSTEM FOR PUNJABI LANGUAGE
Allen	1980	Speech synthesis from text
Akinwonmi et al.	2013	A prosodic text-to-speech system for yorùbá language
Narupiyakul et al.	2005	A stochastic knowledge-based Thai text-to-speech system
Othman et al.	2007	SMaTTS: Standard malay text to speech system
IMRAN	2020	ADMAS UNIVERSITY SCHOOL OF POST GRADUATE STUDIES DEPARTMENT OF COMPUTER SCIENCE
Kasie et al.	2012	Concatenative speech synthesis for Amharic using unit selection method
Hamza et al.	2005	On building a concatenative speech synthesis system from the blizzard challenge speech databases.