AU2001275991A1 - System and method for voice recognition with a plurality of voice recognition engines - Google Patents
Info
- Publication number
- AU2001275991A1
- Authority
- AU
- Australia
- Prior art keywords
- voice recognition
- engine
- results
- engines
- dtw
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/32—Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Telephonic Communication Services (AREA)
- Selective Calling Equipment (AREA)
- Fittings On The Vehicle Exterior For Carrying Loads, And Devices For Holding Or Mounting Articles (AREA)
- Machine Translation (AREA)
- Circuit For Audible Band Transducer (AREA)
Abstract
A method and system that combine multiple voice recognition engines and resolve any differences between the results of the individual engines. A speaker-independent (SI) Hidden Markov Model (HMM) engine, a speaker-independent Dynamic Time Warping (DTW-SI) engine, and a speaker-dependent Dynamic Time Warping (DTW-SD) engine are combined. Combining and resolving the results of these engines yields a system with better recognition accuracy and lower rejection rates than a system that uses the results of only one engine.
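The abstract does not state how the engine results are resolved, so the following is only a minimal sketch of one possible approach, not the method claimed in the patent. It assumes each engine reports a best-matching word and a distance (lower is better), and it uses a simple agreement vote with two acceptance thresholds. The names `EngineResult`, `combine`, `accept_threshold`, and `solo_threshold` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class EngineResult:
    engine: str      # label such as "HMM-SI", "DTW-SI", or "DTW-SD"
    word: str        # best-matching vocabulary word from that engine
    distance: float  # assumed convention: lower means a closer match


def combine(results: List[EngineResult],
            accept_threshold: float = 1.0,
            solo_threshold: float = 0.5) -> Optional[str]:
    """Return the accepted word, or None to reject the utterance.

    Illustrative rule (an assumption, not taken from the patent):
    prefer the word proposed by the most engines, break ties by mean
    distance, and require a stricter threshold when only one engine
    backs the winning word.
    """
    if not results:
        return None

    # Group the engine results by the word each engine proposes.
    votes: Dict[str, List[EngineResult]] = {}
    for r in results:
        votes.setdefault(r.word, []).append(r)

    def mean_distance(backers: List[EngineResult]) -> float:
        return sum(b.distance for b in backers) / len(backers)

    # Most backers first; among equals, the smaller mean distance wins.
    word, backers = min(
        votes.items(),
        key=lambda item: (-len(item[1]), mean_distance(item[1])),
    )

    # Agreement between engines earns the looser threshold.
    limit = accept_threshold if len(backers) >= 2 else solo_threshold
    return word if mean_distance(backers) <= limit else None
```

In this sketch, two engines agreeing on a word can accept it even when neither match is very close, while a lone engine must clear the stricter `solo_threshold`. This is one way a combined system could reject less often than a single engine without accepting weak, uncorroborated matches.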
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US09/618,177 US6671669B1 (en) | 2000-07-18 | 2000-07-18 | combined engine system and method for voice recognition |
| US09618177 | 2000-07-18 | | |
| PCT/US2001/022761 WO2002007148A1 (en) | 2000-07-18 | 2001-07-17 | System and method for voice recognition with a plurality of voice recognition engines |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| AU2001275991A1 (en) | 2002-01-30 |
Family
ID=24476623
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| AU2001275991A Abandoned AU2001275991A1 (en) | 2000-07-18 | 2001-07-17 | System and method for voice recognition with a plurality of voice recognition engines |
Country Status (9)
| Country | Link |
|---|---|
| US (1) | US6671669B1 (en) |
| EP (1) | EP1301922B1 (en) |
| CN (1) | CN1188831C (en) |
| AT (1) | ATE349751T1 (en) |
| AU (1) | AU2001275991A1 (en) |
| DE (1) | DE60125542T2 (en) |
| ES (1) | ES2278763T3 (en) |
| TW (1) | TWI253056B (en) |
| WO (1) | WO2002007148A1 (en) |
Families Citing this family (59)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US7003463B1 (en) | 1998-10-02 | 2006-02-21 | International Business Machines Corporation | System and method for providing network coordinated conversational services |
| US6754629B1 (en) * | 2000-09-08 | 2004-06-22 | Qualcomm Incorporated | System and method for automatic voice recognition using mapping |
| US20030004720A1 (en) * | 2001-01-30 | 2003-01-02 | Harinath Garudadri | System and method for computing and transmitting parameters in a distributed voice recognition system |
| US20020143540A1 (en) * | 2001-03-28 | 2002-10-03 | Narendranath Malayath | Voice recognition system using implicit speaker adaptation |
| US7941313B2 (en) | 2001-05-17 | 2011-05-10 | Qualcomm Incorporated | System and method for transmitting speech activity information ahead of speech features in a distributed voice recognition system |
| US7203643B2 (en) * | 2001-06-14 | 2007-04-10 | Qualcomm Incorporated | Method and apparatus for transmitting speech activity in distributed voice recognition systems |
| US7366673B2 (en) * | 2001-06-15 | 2008-04-29 | International Business Machines Corporation | Selective enablement of speech recognition grammars |
| TW541517B (en) * | 2001-12-25 | 2003-07-11 | Univ Nat Cheng Kung | Speech recognition system |
| US6996526B2 (en) * | 2002-01-02 | 2006-02-07 | International Business Machines Corporation | Method and apparatus for transcribing speech when a plurality of speakers are participating |
| US7203652B1 (en) * | 2002-02-21 | 2007-04-10 | Nuance Communications | Method and system for improving robustness in a speech system |
| JP4304952B2 (en) * | 2002-10-07 | 2009-07-29 | 三菱電機株式会社 | On-vehicle controller and program for causing computer to execute operation explanation method thereof |
| US20040138885A1 (en) * | 2003-01-09 | 2004-07-15 | Xiaofan Lin | Commercial automatic speech recognition engine combinations |
| JP3678421B2 (en) * | 2003-02-19 | 2005-08-03 | 松下電器産業株式会社 | Speech recognition apparatus and speech recognition method |
| US7523097B1 (en) | 2004-01-13 | 2009-04-21 | Juniper Networks, Inc. | Restoration of archived configurations for a network device |
| KR100693284B1 (en) * | 2005-04-14 | 2007-03-13 | 학교법인 포항공과대학교 | Speech recognition device |
| CN1963918A (en) * | 2005-11-11 | 2007-05-16 | 株式会社东芝 | Compress of speaker cyclostyle, combination apparatus and method and authentication of speaker |
| US7970613B2 (en) | 2005-11-12 | 2011-06-28 | Sony Computer Entertainment Inc. | Method and system for Gaussian probability data bit reduction and computation |
| US7778831B2 (en) | 2006-02-21 | 2010-08-17 | Sony Computer Entertainment Inc. | Voice recognition with dynamic filter bank adjustment based on speaker categorization determined from runtime pitch |
| US8010358B2 (en) | 2006-02-21 | 2011-08-30 | Sony Computer Entertainment Inc. | Voice recognition with parallel gender and age normalization |
| US8532984B2 (en) | 2006-07-31 | 2013-09-10 | Qualcomm Incorporated | Systems, methods, and apparatus for wideband encoding and decoding of active frames |
| GB0616070D0 (en) * | 2006-08-12 | 2006-09-20 | Ibm | Speech Recognition Feedback |
| US8239190B2 (en) | 2006-08-22 | 2012-08-07 | Qualcomm Incorporated | Time-warping frames of wideband vocoder |
| US7813922B2 (en) * | 2007-01-30 | 2010-10-12 | Nokia Corporation | Audio quantization |
| WO2008105263A1 (en) * | 2007-02-28 | 2008-09-04 | Nec Corporation | Weight coefficient learning system and audio recognition system |
| US7904410B1 (en) * | 2007-04-18 | 2011-03-08 | The Mathworks, Inc. | Constrained dynamic time warping |
| US8352265B1 (en) | 2007-12-24 | 2013-01-08 | Edward Lin | Hardware implemented backend search engine for a high-rate speech recognition system |
| US8639510B1 (en) | 2007-12-24 | 2014-01-28 | Kai Yu | Acoustic scoring unit implemented on a single FPGA or ASIC |
| US8463610B1 (en) | 2008-01-18 | 2013-06-11 | Patrick J. Bourke | Hardware-implemented scalable modular engine for low-power speech recognition |
| WO2010019831A1 (en) * | 2008-08-14 | 2010-02-18 | 21Ct, Inc. | Hidden markov model for speech processing with training method |
| US8788256B2 (en) | 2009-02-17 | 2014-07-22 | Sony Computer Entertainment Inc. | Multiple language voice recognition |
| US8442833B2 (en) | 2009-02-17 | 2013-05-14 | Sony Computer Entertainment Inc. | Speech processing with source location estimation using signals from two or more microphones |
| US8442829B2 (en) | 2009-02-17 | 2013-05-14 | Sony Computer Entertainment Inc. | Automatic computation streaming partition for voice recognition on multiple processors with limited memory |
| US8417526B2 (en) * | 2009-03-13 | 2013-04-09 | Adacel, Inc. | Speech recognition learning system and method |
| US9026444B2 (en) | 2009-09-16 | 2015-05-05 | At&T Intellectual Property I, L.P. | System and method for personalization of acoustic models for automatic speech recognition |
| US8812321B2 (en) * | 2010-09-30 | 2014-08-19 | At&T Intellectual Property I, L.P. | System and method for combining speech recognition outputs from a plurality of domain-specific speech recognizers via machine learning |
| US20120168331A1 (en) * | 2010-12-30 | 2012-07-05 | Safecode Drug Technologies Corp. | Voice template protector for administering medicine |
| US8930194B2 (en) * | 2011-01-07 | 2015-01-06 | Nuance Communications, Inc. | Configurable speech recognition system using multiple recognizers |
| US9153235B2 (en) | 2012-04-09 | 2015-10-06 | Sony Computer Entertainment Inc. | Text dependent speaker recognition with long-term feature based on functional data analysis |
| KR20150063423A (en) | 2012-10-04 | 2015-06-09 | 뉘앙스 커뮤니케이션즈, 인코포레이티드 | Improved hybrid controller for asr |
| US9240184B1 (en) * | 2012-11-15 | 2016-01-19 | Google Inc. | Frame-level combination of deep neural network and gaussian mixture models |
| US9761228B2 (en) * | 2013-02-25 | 2017-09-12 | Mitsubishi Electric Corporation | Voice recognition system and voice recognition device |
| US20140337030A1 (en) * | 2013-05-07 | 2014-11-13 | Qualcomm Incorporated | Adaptive audio frame processing for keyword detection |
| CN104143330A (en) * | 2013-05-07 | 2014-11-12 | 佳能株式会社 | Voice recognizing method and voice recognizing system |
| US9225879B2 (en) * | 2013-12-27 | 2015-12-29 | TCL Research America Inc. | Method and apparatus for video sequential alignment |
| WO2016030568A1 (en) * | 2014-08-28 | 2016-03-03 | Nokia Technologies Oy | Audio parameter quantization |
| CN104616653B (en) * | 2015-01-23 | 2018-02-23 | 北京云知声信息技术有限公司 | Wake up word matching process, device and voice awakening method, device |
| US10134425B1 (en) * | 2015-06-29 | 2018-11-20 | Amazon Technologies, Inc. | Direction-based speech endpointing |
| US10536464B2 (en) * | 2016-06-22 | 2020-01-14 | Intel Corporation | Secure and smart login engine |
| US10971157B2 (en) | 2017-01-11 | 2021-04-06 | Nuance Communications, Inc. | Methods and apparatus for hybrid speech recognition processing |
| US10607601B2 (en) * | 2017-05-11 | 2020-03-31 | International Business Machines Corporation | Speech recognition by selecting and refining hot words |
| CN109285548A (en) * | 2017-07-19 | 2019-01-29 | 阿里巴巴集团控股有限公司 | Information processing method, system, electronic equipment and computer storage medium |
| GB2566760B (en) | 2017-10-20 | 2019-10-23 | Please Hold Uk Ltd | Audio Signal |
| GB2566759B8 (en) | 2017-10-20 | 2021-12-08 | Please Hold Uk Ltd | Encoding identifiers to produce audio identifiers from a plurality of audio bitstreams |
| US12131228B2 (en) | 2019-04-02 | 2024-10-29 | International Business Machines Corporation | Method for accessing data records of a master data management system |
| CN111128154B (en) * | 2019-12-03 | 2022-06-03 | 杭州蓦然认知科技有限公司 | Method and device for forming interaction engine cluster by aggregation |
| CN111694331B (en) * | 2020-05-11 | 2021-11-02 | 杭州睿疆科技有限公司 | System, method and computer equipment for adjusting production process parameters |
| US11664033B2 (en) * | 2020-06-15 | 2023-05-30 | Samsung Electronics Co., Ltd. | Electronic apparatus and controlling method thereof |
| US11996087B2 (en) | 2021-04-30 | 2024-05-28 | Comcast Cable Communications, Llc | Method and apparatus for intelligent voice recognition |
| CN115376513B (en) * | 2022-10-19 | 2023-05-12 | 广州小鹏汽车科技有限公司 | Voice interaction method, server and computer readable storage medium |
Family Cites Families (15)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US4587670A (en) * | 1982-10-15 | 1986-05-06 | At&T Bell Laboratories | Hidden Markov model speech recognition arrangement |
| US4783804A (en) * | 1985-03-21 | 1988-11-08 | American Telephone And Telegraph Company, At&T Bell Laboratories | Hidden Markov model speech recognition arrangement |
| US4852180A (en) * | 1987-04-03 | 1989-07-25 | American Telephone And Telegraph Company, At&T Bell Laboratories | Speech recognition by acoustic/phonetic system and technique |
| US5167004A (en) * | 1991-02-28 | 1992-11-24 | Texas Instruments Incorporated | Temporal decorrelation method for robust speaker verification |
| US5222190A (en) * | 1991-06-11 | 1993-06-22 | Texas Instruments Incorporated | Apparatus and method for identifying a speech pattern |
| BR9206143A (en) * | 1991-06-11 | 1995-01-03 | Qualcomm Inc | Vocal end compression processes and for variable rate encoding of input frames, apparatus to compress an acoustic signal into variable rate data, prognostic encoder triggered by variable rate code (CELP) and decoder to decode encoded frames |
| US5450522A (en) * | 1991-08-19 | 1995-09-12 | U S West Advanced Technologies, Inc. | Auditory model for parametrization of speech |
| CA2126380C (en) * | 1993-07-22 | 1998-07-07 | Wu Chou | Minimum error rate training of combined string models |
| US5839103A (en) * | 1995-06-07 | 1998-11-17 | Rutgers, The State University Of New Jersey | Speaker verification system using decision fusion logic |
| US5754978A (en) * | 1995-10-27 | 1998-05-19 | Speech Systems Of Colorado, Inc. | Speech recognition system |
| US5819220A (en) * | 1996-09-30 | 1998-10-06 | Hewlett-Packard Company | Web triggered word set boosting for speech interfaces to the world wide web |
| US6003002A (en) * | 1997-01-02 | 1999-12-14 | Texas Instruments Incorporated | Method and system of adapting speech recognition models to speaker environment |
| US5893059A (en) * | 1997-04-17 | 1999-04-06 | Nynex Science And Technology, Inc. | Speech recoginition methods and apparatus |
| US6014624A (en) * | 1997-04-18 | 2000-01-11 | Nynex Science And Technology, Inc. | Method and apparatus for transitioning from one voice recognition system to another |
| US6526380B1 (en) | 1999-03-26 | 2003-02-25 | Koninklijke Philips Electronics N.V. | Speech recognition system having parallel large vocabulary recognition engines |
2000
- 2000-07-18 US US09/618,177 patent/US6671669B1/en not_active Expired - Lifetime
2001
- 2001-07-17 AT AT01953554T patent/ATE349751T1/en not_active IP Right Cessation
- 2001-07-17 WO PCT/US2001/022761 patent/WO2002007148A1/en not_active Ceased
- 2001-07-17 AU AU2001275991A patent/AU2001275991A1/en not_active Abandoned
- 2001-07-17 EP EP01953554A patent/EP1301922B1/en not_active Expired - Lifetime
- 2001-07-17 ES ES01953554T patent/ES2278763T3/en not_active Expired - Lifetime
- 2001-07-17 CN CNB018145922A patent/CN1188831C/en not_active Expired - Fee Related
- 2001-07-17 DE DE60125542T patent/DE60125542T2/en not_active Expired - Lifetime
- 2001-07-18 TW TW090117578A patent/TWI253056B/en not_active IP Right Cessation
Also Published As
| Publication number | Publication date |
|---|---|
| US6671669B1 (en) | 2003-12-30 |
| DE60125542D1 (en) | 2007-02-08 |
| ATE349751T1 (en) | 2007-01-15 |
| EP1301922B1 (en) | 2006-12-27 |
| CN1454380A (en) | 2003-11-05 |
| ES2278763T3 (en) | 2007-08-16 |
| CN1188831C (en) | 2005-02-09 |
| TWI253056B (en) | 2006-04-11 |
| EP1301922A1 (en) | 2003-04-16 |
| WO2002007148A1 (en) | 2002-01-24 |
| DE60125542T2 (en) | 2007-10-11 |
| HK1057816A1 (en) | 2004-04-16 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| AU2001275991A1 (en) | System and method for voice recognition with a plurality of voice recognition engines | |
| DE60124408D1 (en) | COMBINATION OF DIGITAL TIME SHIFTING AND HMM IN SPEAKER DEPENDENCE AND SPEAKER INDEPENDENT WAY FOR LANGUAGE RECOGNITION | |
| WO2004100638A3 (en) | Source-dependent text-to-speech system | |
| WO2004090866A3 (en) | Phonetically based speech recognition system and method | |
| AU7339000A (en) | A system, method, and article of manufacture for detecting emotion in voice signals through analysis of a plurality of voice signal parameters | |
| EP0881625A3 (en) | Multiple models integration for multi-environment speech recognition | |
| WO2008142836A1 (en) | Voice tone converting device and voice tone converting method | |
| WO2006023631A3 (en) | Document transcription system training | |
| WO2004003688A8 (en) | A method for comparing a transcribed text file with a previously created file | |
| TW340938B (en) | Dialog device with voice identification capabilty | |
| EP0977174A3 (en) | Search optimization system and method for continuous speech recognition | |
| ATE407420T1 (en) | DISTRIBUTED SPEECH RECOGNITION SYSTEM USING ACOUSTIC FEATURE VECTOR MODIFICATION | |
| ATE421136T1 (en) | AUDIOVISUAL VOICE ACTIVITY DETECTION FOR A VOICE RECOGNITION SYSTEM | |
| WO1998034216A3 (en) | System and method for detecting a recorded voice | |
| ATE246835T1 (en) | SPEAKER RECOGNITION | |
| GB2437040A (en) | Underwater sound projector system and method of producing same | |
| EP0852374A3 (en) | Method and system for speaker-independent recognition of user-defined phrases | |
| WO2006070373A3 (en) | A system and a method for representing unrecognized words in speech to text conversions as syllables | |
| EP4303796A3 (en) | Meeting-adapted language model for speech recognition | |
| WO2006082868A3 (en) | Method and system for identifying speech sound and non-speech sound in an environment | |
| WO2007018802A3 (en) | Method and system for operation of a voice activity detector | |
| WO2002079744A3 (en) | Sound characterisation and/or identification based on prosodic listening | |
| CA2564760A1 (en) | Speech analysis using statistical learning | |
| DE69916297D1 (en) | INTERMEDIATE CONNECTION PHONEMIC MODELS | |
| DE59902143D1 (en) | METHOD AND DEVICE FOR OUTPUTING INFORMATION AND / OR MESSAGES BY VOICE |