WO2008108232A1 - Voice recognition device, voice recognition method, and voice recognition program - Google Patents

Voice recognition device, voice recognition method, and voice recognition program

Info

Publication number
WO2008108232A1
Authority
WO
WIPO (PCT)
Prior art keywords
voice
accuracy
audio recognition
parameter
input signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2008/053331
Other languages
English (en)
French (fr)
Inventor
Takayuki Arakawa
Ken Hanazawa
Masanori Tsujikawa
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corp
Priority to US12/528,022 (US8612225B2)
Priority to JP2009502533A (JP5229216B2)
Publication of WO2008108232A1
Anticipated expiration
Ceased (current legal status)

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/02: Feature extraction for speech recognition; Selection of recognition unit

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)

Abstract

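 An object of the present invention is to enable voice recognition with appropriate parameters that give high recognition accuracy at low computational cost. A voice model storage unit 7 stores in advance voice models having a plurality of levels of detail, each representing characteristic properties of voice. A detail-level determination unit 9 selects, from among the levels of detail of the voice models stored in the voice model storage unit 7, the level of detail closest to the characteristic properties of the input signal. A parameter setting unit 10 then controls the parameters related to voice recognition according to the selected level of detail. With this configuration, when a high level of detail and a low level of detail of the voice model are compared and the high level of detail is closer to the input signal, voice recognition is performed using parameters with low computational cost. Conversely, when the low level of detail is closer to the input signal, voice recognition is performed using parameters that give higher accuracy.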
PCT/JP2008/053331 2007-02-28 2008-02-26 Voice recognition device, voice recognition method, and voice recognition program Ceased WO2008108232A1 (ja)
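
The selection logic summarized in the abstract above can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration and is not taken from the patent record: voice models are represented as 1-D Gaussian mixtures, "closeness" to the input signal is measured by average log-likelihood, the controlled parameter is a search beam width, and all names (`Gaussian`, `select_detail_level`, `choose_parameters`, `EXAMPLE_MODELS`) and numeric values are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Gaussian:
    """One mixture component (hypothetical 1-D stand-in for a voice model)."""
    mean: float
    var: float
    weight: float


def avg_log_likelihood(frames, mixture):
    """Average per-frame log-likelihood of the frames under a Gaussian mixture."""
    total = 0.0
    for x in frames:
        p = sum(
            g.weight * math.exp(-((x - g.mean) ** 2) / (2 * g.var))
            / math.sqrt(2 * math.pi * g.var)
            for g in mixture
        )
        total += math.log(max(p, 1e-300))  # floor avoids log(0)
    return total / len(frames)


def select_detail_level(frames, models):
    """Return the detail level whose model is closest to the input frames."""
    return max(models, key=lambda level: avg_log_likelihood(frames, models[level]))


def choose_parameters(frames, models, narrow_beam=50, wide_beam=200):
    """Pick a detail level, then set the (assumed) beam width from it.

    If the highest detail level fits best, a cheap narrow beam is used;
    otherwise a wider, more accurate (and more expensive) beam is used.
    """
    level = select_detail_level(frames, models)
    beam = narrow_beam if level == max(models) else wide_beam
    return level, beam


# Hypothetical models: level 1 is coarse (one broad Gaussian),
# level 2 is detailed (two narrow Gaussians).
EXAMPLE_MODELS = {
    1: [Gaussian(mean=0.0, var=4.0, weight=1.0)],
    2: [Gaussian(mean=-2.0, var=0.25, weight=0.5),
        Gaussian(mean=2.0, var=0.25, weight=0.5)],
}
```

Calling `choose_parameters(frames, EXAMPLE_MODELS)` returns a `(level, beam)` pair. Frames that cluster near the detailed model's components select the high detail level and the narrow beam, while frames that match only the coarse model select the wide beam.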

Priority Applications (2)

Application Number Priority Date Filing Date Title
US12/528,022 US8612225B2 (en) 2007-02-28 2008-02-26 Voice recognition device, voice recognition method, and voice recognition program
JP2009502533A JP5229216B2 (ja) 2007-02-28 2008-02-26 Voice recognition device, voice recognition method, and voice recognition program

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2007-048898 2007-02-28
JP2007048898 2007-02-28

Publications (1)

Publication Number Publication Date
WO2008108232A1 (ja) 2008-09-12

Family

ID=39738118

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2008/053331 Ceased WO2008108232A1 (ja) 2007-02-28 2008-02-26 Voice recognition device, voice recognition method, and voice recognition program

Country Status (4)

Country Link
US (1) US8612225B2 (ja)
JP (1) JP5229216B2 (ja)
CN (1) CN101622660A (ja)
WO (1) WO2008108232A1 (ja)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010204274A (ja) * 2009-03-02 2010-09-16 Toshiba Corp Speech recognition device, method therefor, and program therefor

Families Citing this family (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2464093B (en) * 2008-09-29 2011-03-09 Toshiba Res Europ Ltd A speech recognition method
JP2011033680A (ja) * 2009-07-30 2011-02-17 Sony Corp Voice processing device and method, and program
FR2964223B1 (fr) * 2010-08-31 2016-04-01 Commissariat Energie Atomique Method for configuring a sensor-based detection device, and corresponding computer program and adaptive device
WO2013003772A2 (en) 2011-06-30 2013-01-03 Google Inc. Speech recognition using variable-length context
US10431235B2 (en) 2012-05-31 2019-10-01 Elwha Llc Methods and systems for speech adaptation data
US10395672B2 (en) 2012-05-31 2019-08-27 Elwha Llc Methods and systems for managing adaptation data
US20130325447A1 (en) * 2012-05-31 2013-12-05 Elwha LLC, a limited liability corporation of the State of Delaware Speech recognition adaptation systems based on adaptation data
US9620128B2 (en) 2012-05-31 2017-04-11 Elwha Llc Speech recognition adaptation systems based on adaptation data
US20130325449A1 (en) 2012-05-31 2013-12-05 Elwha Llc Speech recognition adaptation systems based on adaptation data
US9336771B2 (en) * 2012-11-01 2016-05-10 Google Inc. Speech recognition using non-parametric models
US9646605B2 (en) * 2013-01-22 2017-05-09 Interactive Intelligence Group, Inc. False alarm reduction in speech recognition systems using contextual information
JP6011565B2 (ja) * 2014-03-05 2016-10-19 カシオ計算機株式会社 Voice search device, voice search method, and program
US9858922B2 (en) 2014-06-23 2018-01-02 Google Inc. Caching speech recognition scores
KR102292546B1 (ko) 2014-07-21 2021-08-23 삼성전자주식회사 Speech recognition method and apparatus using context information
US9299347B1 (en) 2014-10-22 2016-03-29 Google Inc. Speech recognition using associative mapping
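KR102380833B1 (ko) * 2014-12-02 2022-03-31 삼성전자주식회사 Speech recognition method and speech recognition apparatus
JP6003972B2 (ja) * 2014-12-22 2016-10-05 カシオ計算機株式会社 Voice search device, voice search method, and program
CN105869641A (zh) * 2015-01-22 2016-08-17 佳能株式会社 Speech recognition device and speech recognition method
CN104766607A (zh) * 2015-03-05 2015-07-08 广州视源电子科技股份有限公司 Television program recommendation method and system
KR102492318B1 (ko) 2015-09-18 2023-01-26 삼성전자주식회사 Model training method and apparatus, and data recognition method
JP6841232B2 (ja) * 2015-12-18 2021-03-10 ソニー株式会社 Information processing device, information processing method, and program
JP6495850B2 (ja) * 2016-03-14 2019-04-03 株式会社東芝 Information processing device, information processing method, program, and recognition system
CN105957516B (zh) * 2016-06-16 2019-03-08 百度在线网络技术(北京)有限公司 Method and device for switching among multiple speech recognition models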
US9984688B2 (en) 2016-09-28 2018-05-29 Visteon Global Technologies, Inc. Dynamically adjusting a voice recognition system
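KR20210052564A (ko) * 2018-11-05 2021-05-10 주식회사 엘솔루 Method for generating an optimal language model using big data, and apparatus therefor
CN110647367A (zh) * 2019-09-23 2020-01-03 苏州随身玩信息技术有限公司 Method for adaptive switching of commentary content, and tour guide commentary machine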
US11620982B2 (en) * 2020-06-01 2023-04-04 Rovi Guides, Inc. Systems and methods for improving content discovery in response to a voice query using a recognition rate which depends on detected trigger terms

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
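JPH0667698A (ja) * 1992-06-19 1994-03-11 Seiko Epson Corp Voice recognition device
JPH08506430A (ja) * 1993-06-24 1996-07-09 ノーザン・テレコム・リミテッド Speech recognition method using a two-pass search
JPH10149192A (ja) * 1996-09-20 1998-06-02 Nippon Telegr & Teleph Corp <Ntt> Pattern recognition method and device, and storage medium therefor
JP2000261321A (ja) * 1999-03-09 2000-09-22 Mitsubishi Electric Corp Element distribution search method, vector quantization method, pattern recognition method, speech recognition method, speech recognition device, and recording medium storing a program for determining a recognition result
JP2004117503A (ja) * 2002-09-24 2004-04-15 Nippon Telegr & Teleph Corp <Ntt> Method for creating an acoustic model for speech recognition, device, program, and recording medium therefor, and speech recognition device using the acoustic model
JP2005004018A (ja) * 2003-06-13 2005-01-06 Mitsubishi Electric Corp Voice recognition device
WO2005010868A1 (ja) * 2003-07-29 2005-02-03 Mitsubishi Denki Kabushiki Kaisha Speech recognition system, and terminal and server thereof
JP2005234214A (ja) * 2004-02-19 2005-09-02 Nippon Telegr & Teleph Corp <Ntt> Method and device for generating an acoustic model for speech recognition, and recording medium storing a program for generating an acoustic model for speech recognition
JP2006091864A (ja) * 2004-08-26 2006-04-06 Asahi Kasei Corp Speech recognition device, speech recognition method, and program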

Family Cites Families (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6301555B2 (en) * 1995-04-10 2001-10-09 Corporate Computer Systems Adjustable psycho-acoustic parameters
US5842163A (en) * 1995-06-21 1998-11-24 Sri International Method and apparatus for computing likelihood and hypothesizing keyword appearance in speech
DE69517705T2 (de) * 1995-11-04 2000-11-23 International Business Machines Corp., Armonk Method and apparatus for adapting the size of a language model in a speech recognition system
FI100840B (fi) * 1995-12-12 1998-02-27 Nokia Mobile Phones Ltd Noise suppressor and method for suppressing background noise in noisy speech, and a mobile station
US6018708A (en) * 1997-08-26 2000-01-25 Nortel Networks Corporation Method and apparatus for performing speech recognition utilizing a supplementary lexicon of frequently used orthographies
US6208964B1 (en) * 1998-08-31 2001-03-27 Nortel Networks Limited Method and apparatus for providing unsupervised adaptation of transcriptions
JP2001075596A (ja) 1999-09-03 2001-03-23 Mitsubishi Electric Corp Voice recognition device, voice recognition method, and recording medium storing a voice recognition program
JP5118280B2 (ja) * 1999-10-19 2013-01-16 ソニー エレクトロニクス インク Natural language interface control system
US8392188B1 (en) * 1999-11-05 2013-03-05 At&T Intellectual Property Ii, L.P. Method and system for building a phonotactic model for domain independent speech recognition
US6754626B2 (en) * 2001-03-01 2004-06-22 International Business Machines Corporation Creating a hierarchical tree of language models for a dialog system based on prompt and dialog context
US6839667B2 (en) * 2001-05-16 2005-01-04 International Business Machines Corporation Method of speech recognition by presenting N-best word candidates
US7103542B2 (en) * 2001-12-14 2006-09-05 Ben Franklin Patent Holding Llc Automatically improving a voice recognition system
US7292975B2 (en) * 2002-05-01 2007-11-06 Nuance Communications, Inc. Systems and methods for evaluating speaker suitability for automatic speech recognition aided transcription
EP1685554A1 (en) * 2003-10-09 2006-08-02 TEAC America, Inc. Method, apparatus, and system for synthesizing an audio performance using convolution at multiple sample rates
US7228278B2 (en) * 2004-07-06 2007-06-05 Voxify, Inc. Multi-slot dialog systems and methods
GB0420464D0 (en) * 2004-09-14 2004-10-20 Zentian Ltd A speech recognition circuit and method
US8234116B2 (en) * 2006-08-22 2012-07-31 Microsoft Corporation Calculating cost measures between HMM acoustic models

Also Published As

Publication number Publication date
US20100070277A1 (en) 2010-03-18
CN101622660A (zh) 2010-01-06
JPWO2008108232A1 (ja) 2010-06-10
US8612225B2 (en) 2013-12-17
JP5229216B2 (ja) 2013-07-03

Similar Documents

Publication Publication Date Title
WO2008108232A1 (ja) Voice recognition device, voice recognition method, and voice recognition program
US11430428B2 (en) Method, apparatus, and storage medium for segmenting sentences for speech recognition
HK1222726A1 (zh) Intelligent automated assistant
WO2013066409A8 (en) System, method and program for customized voice communication
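DE502007004737D1 (de) Method for storing a series of measurements
WO2008142836A1 (ja) Voice quality conversion device and voice quality conversion method
KR20180084394A (ko) Method for detecting utterance completion and electronic device implementing the same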
EP4236281A3 (en) Event-triggered hands-free multitasking for media playback
WO2013003772A3 (en) Speech recognition using variable-length context
EP2324896A3 (en) Motion determining apparatus and storage medium having motion determining program stored thereon
EP2306345A3 (en) Speech retrieval apparatus and speech retrieval method
EP4235649A3 (en) Language model biasing
WO2007118100A3 (en) Automatic language model update
WO2010003109A3 (en) Speech recognition with parallel recognition tasks
JP2016526178A5 (ja)
WO2012047036A3 (en) Apparatus and method for adaptive gesture recognition in portable terminal
WO2007095591A3 (en) Voice command interface device
WO2013005248A1 (ja) Voice recognition device and navigation device
EP2136286A3 (en) System and method for automatically producing haptic events from a digital audio file
WO2013134106A3 (en) Device for extracting information from a dialog
WO2008136191A1 (ja) Input device, portable information terminal, and input method
WO2009035281A3 (en) Refrigerator
WO2016139670A8 (en) System and method for generating accurate speech transcription from natural speech audio signals
US20160004501A1 (en) Audio command intent determination system and method
FI20095714A7 (fi) Determining a driving route for arranging automatic control of a mobile mining machine

Legal Events

Date Code Title Description
WWE WIPO information: entry into national phase (Ref document number: 200880006579.5; Country of ref document: CN)
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 08712015; Country of ref document: EP; Kind code of ref document: A1)
WWE WIPO information: entry into national phase (Ref document number: 12528022; Country of ref document: US)
ENP Entry into the national phase (Ref document number: 2009502533; Country of ref document: JP; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 08712015; Country of ref document: EP; Kind code of ref document: A1)