CN110706692B - Training method and system for child speech recognition model - Google Patents
- Publication number
- CN110706692B (application CN201911000370.4A)
- Authority
- CN
- China
- Prior art keywords
- training
- data
- child
- training data
- noise
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/70—Circuitry for compensating brightness variation in the scene
- H04N23/71—Circuitry for evaluating the brightness variation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Signal Processing (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Electrically Operated Instructional Devices (AREA)
Abstract
Description
Claims (10)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201911000370.4A CN110706692B (zh) | 2019-10-21 | 2019-10-21 | Training method and system for child speech recognition model |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201911000370.4A CN110706692B (zh) | 2019-10-21 | 2019-10-21 | Training method and system for child speech recognition model |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN110706692A (zh) | 2020-01-17 |
| CN110706692B (zh) | 2021-12-14 |
Family
ID=69201956
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201911000370.4A Active CN110706692B (zh) | Training method and system for child speech recognition model | 2019-10-21 | 2019-10-21 |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN110706692B (zh) |
Families Citing this family (14)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
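| CN111508505B (zh) * | 2020-04-28 | 2023-11-03 | 讯飞智元信息科技有限公司 | Speaker recognition method, apparatus, device, and storage medium |
| CN111540345B (zh) * | 2020-05-09 | 2022-06-24 | 北京大牛儿科技发展有限公司 | Weakly supervised speech recognition model training method and apparatus |
| CN111986659B (zh) * | 2020-07-16 | 2024-08-06 | 百度在线网络技术(北京)有限公司 | Method and apparatus for building an audio generation model |
| CN111899759B (zh) * | 2020-07-27 | 2021-09-03 | 北京嘀嘀无限科技发展有限公司 | Audio data pre-training and model training method, apparatus, device, and medium |
| CN112102816A (zh) * | 2020-08-17 | 2020-12-18 | 北京百度网讯科技有限公司 | Speech recognition method, apparatus, system, electronic device, and storage medium |
| CN112545532B (zh) * | 2020-11-26 | 2023-05-16 | 中国人民解放军战略支援部队信息工程大学 | Data augmentation method and system for EEG signal classification and recognition |
| CN112530401B (zh) * | 2020-11-30 | 2024-05-03 | 清华珠三角研究院 | Speech synthesis method, system, and apparatus |
| CN112509600A (zh) * | 2020-12-11 | 2021-03-16 | 平安科技(深圳)有限公司 | Model training method and apparatus, voice conversion method, device, and storage medium |
| CN112634860B (zh) * | 2020-12-29 | 2022-05-03 | 思必驰科技股份有限公司 | Method for screening training corpora for a child speech recognition model |
| CN112820324B (zh) * | 2020-12-31 | 2024-06-25 | 平安科技(深圳)有限公司 | Multi-label voice activity detection method, apparatus, and storage medium |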
| US11908453B2 (en) | 2021-02-10 | 2024-02-20 | Direct Cursus Technology L.L.C | Method and system for classifying a user of an electronic device |
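| CN113160855B (zh) * | 2021-05-28 | 2022-10-21 | 思必驰科技股份有限公司 | Method and apparatus for improving an online voice activity detection system |
| CN115132185A (zh) * | 2022-06-29 | 2022-09-30 | 中国银行股份有限公司 | Speech recognition model training method, speech recognition method, and related device |
| CN115206298A (zh) * | 2022-07-08 | 2022-10-18 | 蔚来汽车科技(安徽)有限公司 | Method for augmenting child speech data based on synchronized constraints on fundamental frequency and speaking rate |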
Citations (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2016114428A1 (ko) * | 2015-01-16 | 2016-07-21 | 삼성전자 주식회사 | Method and device for performing speech recognition using a grammar model |
| EP3340239A1 (en) * | 2016-12-23 | 2018-06-27 | Samsung Electronics Co., Ltd. | Electronic device and speech recognition method therefor |
| CN108922518A (zh) * | 2018-07-18 | 2018-11-30 | 苏州思必驰信息科技有限公司 | Speech data augmentation method and system |
| US10152970B1 (en) * | 2018-02-08 | 2018-12-11 | Capital One Services, Llc | Adversarial learning and generation of dialogue responses |
| CN109741736A (zh) * | 2017-10-27 | 2019-05-10 | 百度(美国)有限责任公司 | System and method for robust speech recognition using generative adversarial networks |
| CN110211575A (zh) * | 2019-06-13 | 2019-09-06 | 苏州思必驰信息科技有限公司 | Speech noise-adding method and system for data augmentation |
| CN110246489A (zh) * | 2019-06-14 | 2019-09-17 | 苏州思必驰信息科技有限公司 | Speech recognition method and system for children |
Non-Patent Citations (7)
| Title |
|---|
| Data Augmentation using Conditional Generative Adversarial Networks for Robust Speech Recognition; Peiyao Sheng et al; 《ISCSLP 2018》; 20190506; full text * |
| Data augmentation using generative adversarial networks for robust speech recognition; Yanmin Qian et al; 《Speech Communication》; 20190819; full text * |
| EXPLORING SPEECH ENHANCEMENT WITH GENERATIVE ADVERSARIAL NETWORKS FOR ROBUST SPEECH RECOGNITION; Chris Donahue et al; 《ICASSP 2018》; 20180913; full text * |
| GENERATIVE ADVERSARIAL NETWORKS BASED DATA AUGMENTATION FOR NOISE ROBUST SPEECH RECOGNITION; Hu Hu et al; 《ICASSP 2018》; 20180913; full text * |
| ROBUST SPEECH RECOGNITION USING GENERATIVE ADVERSARIAL NETWORKS; Anuroop Sriram et al; 《ICASSP 2018》; 20180913; full text * |
| Unsupervised acoustic modeling for speech recognition based on an optimized data selection strategy under low-data-resource conditions; Qian Yanmin et al; 《清华大学学报(自然科学版)》; 20130719; full text * |
| Research on speech enhancement methods based on generative adversarial networks; Wang Haiwu; 《中国优秀博硕士学位论文全文数据库(硕士)信息科技辑》; 20180815; full text * |
Also Published As
| Publication number | Publication date |
|---|---|
| CN110706692A (zh) | 2020-01-17 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
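| CN110706692B (zh) | | Training method and system for child speech recognition model |
| CN112735373B (zh) | | Speech synthesis method, apparatus, device, and storage medium |
| CN110600017B (zh) | | Training method for speech processing model, speech recognition method, system, and apparatus |
| CN110246487B (zh) | | Optimization method and system for single-channel speech recognition model |
| CN111081259B (zh) | | Speech recognition model training method and system based on speaker augmentation |
| US12354590B2 (en) | | Audio processing method and apparatus based on artificial intelligence, device, storage medium, and computer program product |
| CN109637546B (zh) | | Knowledge distillation method and apparatus |
| CN110246488B (zh) | | Voice conversion method and apparatus using a semi-optimized CycleGAN model |
| EP1989701B1 (en) | | Speaker authentication |
| CN107871496B (zh) | | Speech recognition method and apparatus |
| CN109887484A (zh) | | Speech recognition and speech synthesis method and apparatus based on dual learning |
| CN111640456B (zh) | | Overlapping speech detection method, apparatus, and device |
| CN114596844A (zh) | | Acoustic model training method, speech recognition method, and related device |
| CN112837669B (zh) | | Speech synthesis method, apparatus, and server |
| CN117711386A (zh) | | Speech recognition model training method, speech recognition method, apparatus, device, and medium |
| KR20210014949A (ko) | | Decoding method and apparatus in an artificial neural network for speech recognition |
| CN112017694B (zh) | | Speech data evaluation method and apparatus, storage medium, and electronic apparatus |
| CN114783426B (zh) | | Speech recognition method, apparatus, electronic device, and storage medium |
| CN116825092B (zh) | | Speech recognition method, and training method and apparatus for speech recognition model |
| CN108417207A (zh) | | Adaptive method and system for deep hybrid generative network |
| CN114333790B (zh) | | Data processing method, apparatus, device, storage medium, and program product |
| CN116092485A (zh) | | Training method and apparatus for speech recognition model, and speech recognition method and apparatus |
| CN114267334A (zh) | | Speech recognition model training method and speech recognition method |
| Takamichi et al. | | Sampling-based speech parameter generation using moment-matching networks |
| CN111653270B (zh) | | Speech processing method, apparatus, computer-readable storage medium, and electronic device |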
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | TA01 | Transfer of patent application right | Effective date of registration: 20200616. Address after: Room 223, old administration building, 800 Dongchuan Road, Minhang District, Shanghai, 200240. Applicant after: Shanghai Jiaotong University Intellectual Property Management Co.,Ltd.; AI SPEECH Ltd. Address before: 200240 Dongchuan Road, Shanghai, No. 800, No. Applicant before: SHANGHAI JIAO TONG University; AI SPEECH Ltd. |
| | TA01 | Transfer of patent application right | Effective date of registration: 20201020. Address after: 215123 14 Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou, Jiangsu. Applicant after: AI SPEECH Ltd. Address before: Room 223, old administration building, 800 Dongchuan Road, Minhang District, Shanghai, 200240. Applicant before: Shanghai Jiaotong University Intellectual Property Management Co.,Ltd.; AI SPEECH Ltd. |
| | CB02 | Change of applicant information | Address after: 215123 building 14, Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou City, Jiangsu Province. Applicant after: Sipic Technology Co.,Ltd. Address before: 215123 building 14, Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou City, Jiangsu Province. Applicant before: AI SPEECH Ltd. |
| | GR01 | Patent grant | |