WO2020113935A1 - Method and apparatus for increasing voice wake-up success rate, and storage medium - Google Patents

Method and apparatus for increasing voice wake-up success rate, and storage medium

Info

Publication number
WO2020113935A1
WO2020113935A1 PCT/CN2019/091258 CN2019091258W WO2020113935A1 WO 2020113935 A1 WO2020113935 A1 WO 2020113935A1 CN 2019091258 W CN2019091258 W CN 2019091258W WO 2020113935 A1 WO2020113935 A1 WO 2020113935A1
Authority
WO
WIPO (PCT)
Prior art keywords
wake
voice
score
word
voice signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2019/091258
Other languages
English (en)
Chinese (zh)
Inventor
关海欣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Yunzhisheng Information Technology Co Ltd
Original Assignee
Beijing Yunzhisheng Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Yunzhisheng Information Technology Co Ltd
Publication of WO2020113935A1
Anticipated expiration
Legal status: Ceased

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/22 Interactive procedures; Man-machine interfaces
    • G10L17/24 Interactive procedures; Man-machine interfaces the user being prompted to utter a password or a predefined phrase
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161 Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166 Microphone arrays; Beamforming

Definitions

  • Step (1): the speech signal is analyzed and then scored by the neural network. Specifically, the neural network first converts the speech signal into corresponding data information, then performs a correlation calculation between that data information and specific words, and obtains a score based on the result of the correlation calculation.
  • Step (2): calculating the steering vector of the voice signal includes either directly calculating the steering vector from the start and end time points at which the wake-up word was generated and the data segment corresponding to the wake-up word, or first calculating the azimuth angle of the data segment and then calculating the steering vector from that azimuth angle.
  • Step (2): based on the start and end time points at which the wake-up word was generated, extract the voice signal corresponding to the wake-up word from the multi-channel cache, and calculate the steering vector of the voice signal.
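
The two routes to a steering vector named in step (2) can be illustrated with a short Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent text: the function names, the use of the principal eigenvector of the spatial covariance as the "direct" estimate, the far-field uniform-linear-array model for the azimuth route, and the speed of sound of 343 m/s.

```python
import numpy as np


def extract_wake_segment(cache, start_s, end_s, sample_rate):
    """Slice the wake-up word out of a (channels, samples) multi-channel cache.

    start_s and end_s are the start and end time points, in seconds,
    reported for the wake-up word (hypothetical interface).
    """
    start = max(0, int(round(start_s * sample_rate)))
    end = min(cache.shape[1], int(round(end_s * sample_rate)))
    if end <= start:
        raise ValueError("empty wake-up word segment")
    return cache[:, start:end]


def steering_from_segment(segment_bins):
    """Direct route: estimate the steering vector for one frequency bin.

    segment_bins is a complex (channels, frames) array of STFT values taken
    from the wake-up word segment. The estimate is the principal eigenvector
    of the spatial covariance matrix, an assumed choice of estimator.
    """
    cov = segment_bins @ segment_bins.conj().T / segment_bins.shape[1]
    _, eigvecs = np.linalg.eigh(cov)  # eigenvalues come back in ascending order
    v = eigvecs[:, -1]
    # Remove the arbitrary phase so that the reference microphone 0 is real.
    return v * np.exp(-1j * np.angle(v[0]))


def steering_from_azimuth(azimuth_rad, mic_x, freq_hz, c=343.0):
    """Azimuth route: far-field steering vector for a linear array.

    mic_x holds microphone positions in metres along the array axis.
    """
    delays = np.asarray(mic_x) * np.cos(azimuth_rad) / c
    return np.exp(-2j * np.pi * freq_hz * delays)
```

The direct route needs no array geometry, while the azimuth route needs the microphone positions but yields a vector that is smooth across frequency. Note that the eigenvector estimate has unit norm, so it matches the geometric vector only up to a scale factor.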

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Measurement Of Velocity Or Position Using Acoustic Or Ultrasonic Waves (AREA)
  • Soundproofing, Sound Blocking, And Sound Damping (AREA)

Abstract

The present invention relates to a method and an apparatus for increasing the voice wake-up success rate, and to a storage medium. The method is used to increase the success rate of performing a voice wake-up operation on a terminal device in the standby state. Voice wake-up and microphone-array signal processing, which were originally relatively independent and uncorrelated, are organically combined, and a closed-loop feedback loop is built by correlating the information of each with the other. Through this closed-loop feedback, voice wake-up supplies a true and accurate signal data range to the microphone-array signal processing, so that the microphone-array signal processing obtains accurate statistics about the signals and the noise; voice data from which interfering noise has been removed is then passed to the wake-up engine, so that an accurate and fast wake-up result can be obtained.
PCT/CN2019/091258 2018-12-03 2019-06-14 Method and apparatus for increasing voice wake-up success rate, and storage medium Ceased WO2020113935A1 (fr)
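
One pass of the closed loop described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the abstract does not name a beamformer, so an MVDR beamformer is swapped in here, the function names and the diagonal-loading constant are hypothetical, and the steering vector is taken as the principal eigenvector of the wake-up word covariance for simplicity.

```python
import numpy as np


def mvdr_weights(noise_cov, steering, diag_load=1e-6):
    """MVDR weights w = R^-1 a / (a^H R^-1 a) for one frequency bin.

    A small diagonal loading term (assumed value) keeps the inverse stable.
    """
    m = noise_cov.shape[0]
    loaded = noise_cov + diag_load * np.trace(noise_cov).real / m * np.eye(m)
    num = np.linalg.solve(loaded, steering)
    return num / (steering.conj() @ num)


def closed_loop_pass(frames, wake_mask):
    """Enhance one frequency bin using the wake-up word time range.

    frames is a complex (channels, n_frames) array; wake_mask is a boolean
    (n_frames,) array that is True for frames inside the range reported by
    the wake-up engine. Frames outside the range are treated as noise.
    """
    if not wake_mask.any() or wake_mask.all():
        raise ValueError("need frames both inside and outside the wake range")
    speech = frames[:, wake_mask]
    noise = frames[:, ~wake_mask]
    # Signal and noise statistics come from the range given by the wake-up engine.
    speech_cov = speech @ speech.conj().T / speech.shape[1]
    noise_cov = noise @ noise.conj().T / noise.shape[1]
    _, eigvecs = np.linalg.eigh(speech_cov)
    steering = eigvecs[:, -1]  # principal eigenvector as steering estimate
    w = mvdr_weights(noise_cov, steering)
    # The single-channel output would be fed back to the wake-up engine.
    return w.conj() @ frames
```

The design point the sketch tries to capture is the feedback itself: the wake-up engine's time range decides which frames count as speech and which as noise, and the enhanced output is what the engine scores next.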

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811466502.8A CN109461456B (zh) 2018-12-03 2018-12-03 Method for improving the voice wake-up success rate
CN201811466502.8 2018-12-03

Publications (1)

Publication Number Publication Date
WO2020113935A1 (fr) 2020-06-11

Family

ID=65612332

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/091258 Ceased WO2020113935A1 (fr) 2018-12-03 2019-06-14 Method and apparatus for increasing voice wake-up success rate, and storage medium

Country Status (2)

Country Link
CN (1) CN109461456B (fr)
WO (1) WO2020113935A1 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
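CN112365883A (zh) * 2020-10-29 2021-02-12 安徽江淮汽车集团股份有限公司 Cockpit system speech recognition test method, apparatus, device, and storage medium
CN113223518A (zh) * 2021-04-16 2021-08-06 讯飞智联科技(江苏)有限公司 Human-machine interaction method for an edge computing gateway based on AI speech analysis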

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
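CN109461456B (zh) * 2018-12-03 2022-03-22 云知声智能科技股份有限公司 Method for improving the voice wake-up success rate
CN109979185B (zh) * 2019-04-11 2020-08-14 杭州微纳科技股份有限公司 Far-field voice input device
JP7191792B2 (ja) * 2019-08-23 2022-12-19 株式会社東芝 Information processing device, information processing method, and program
CN111613211B (zh) * 2020-04-17 2023-04-07 云知声智能科技股份有限公司 Method and apparatus for processing speech of specific words
CN112259108B (zh) * 2020-09-27 2024-05-31 中国科学技术大学 Engine response time analysis method, electronic device, and storage medium
CN112562666B (zh) * 2020-11-30 2022-11-04 海信视像科技股份有限公司 Method for screening devices, and service device
CN112466304B (zh) * 2020-12-03 2023-09-08 北京百度网讯科技有限公司 Offline voice interaction method, apparatus, system, device, and storage medium
CN114863936B (zh) * 2021-01-20 2025-05-16 华为技术有限公司 Wake-up method and electronic device
CN113160823B (zh) * 2021-05-26 2024-05-17 中国工商银行股份有限公司 Voice wake-up method and apparatus based on a spiking neural network, and electronic device
CN115588435A (zh) * 2022-11-08 2023-01-10 荣耀终端有限公司 Voice wake-up method and electronic device
CN117575936B (zh) * 2023-11-07 2024-07-19 浙江大学 Magnetic resonance image denoising method, apparatus, and device based on channel correlation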

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106782563A (zh) * 2016-12-28 2017-05-31 上海百芝龙网络科技有限公司 Smart home voice interaction system
CN107172018A (zh) * 2017-04-27 2017-09-15 华南理工大学 Activation-type voiceprint password security control method and system under public background noise
US20180102125A1 (en) * 2016-10-12 2018-04-12 Samsung Electronics Co., Ltd. Electronic device and method for controlling the same
WO2018086033A1 (fr) * 2016-11-10 2018-05-17 Nuance Communications, Inc. Techniques for language-independent wake-up word detection
CN108122563A (zh) * 2017-12-19 2018-06-05 北京声智科技有限公司 Method for improving the voice wake-up rate and correcting DOA
CN109461456A (zh) * 2018-12-03 2019-03-12 北京云知声信息技术有限公司 Method for improving the voice wake-up success rate

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
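CN202794508U (zh) * 2012-09-07 2013-03-13 南京理工大学 Microphone-array-based voice localization device for rescue applications
GB201506046D0 (en) * 2015-04-09 2015-05-27 Sinvent As Speech recognition
CN104936091B (zh) * 2015-05-14 2018-06-15 讯飞智元信息科技有限公司 Intelligent interaction method and system based on a circular microphone array
CN107591151B (zh) * 2017-08-22 2021-03-16 百度在线网络技术(北京)有限公司 Far-field voice wake-up method, apparatus, and terminal device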

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
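CN112365883A (zh) * 2020-10-29 2021-02-12 安徽江淮汽车集团股份有限公司 Cockpit system speech recognition test method, apparatus, device, and storage medium
CN112365883B (zh) * 2020-10-29 2023-12-26 安徽江淮汽车集团股份有限公司 Cockpit system speech recognition test method, apparatus, device, and storage medium
CN113223518A (zh) * 2021-04-16 2021-08-06 讯飞智联科技(江苏)有限公司 Human-machine interaction method for an edge computing gateway based on AI speech analysis
CN113223518B (zh) * 2021-04-16 2024-03-22 讯飞智联科技(江苏)有限公司 Human-machine interaction method for an edge computing gateway based on AI speech analysis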

Also Published As

Publication number Publication date
CN109461456B (zh) 2022-03-22
CN109461456A (zh) 2019-03-12

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 19893458; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 19893458; Country of ref document: EP; Kind code of ref document: A1)