GB2626038B - Systems and Methods for Speech Processing - Google Patents

Systems and Methods for Speech Processing

Info

Publication number
GB2626038B
Authority
GB
United Kingdom
Prior art keywords
systems
methods
speech processing
speech
processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
GB2300276.9A
Other versions
GB2626038A (en)
Inventor
Li Mohan
Sanand Doddipatla Rama
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Toshiba Corp
Original Assignee
Toshiba Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2023-01-09
Filing date
2023-01-09
Publication date
2025-09-24
Application filed by Toshiba Corp
Priority to GB2300276.9A (patent GB2626038B)
Priority to JP2023208671A (patent JP7673166B2)
Publication of GB2626038A
Application granted
Publication of GB2626038B
Legal status: Active

Classifications

    • G PHYSICS
      • G06 COMPUTING OR CALCULATING; COUNTING
        • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
          • G06N3/00 Computing arrangements based on biological models
            • G06N3/02 Neural networks
              • G06N3/04 Architecture, e.g. interconnection topology
                • G06N3/044 Recurrent networks, e.g. Hopfield networks
                • G06N3/045 Combinations of networks
                  • G06N3/0455 Auto-encoder networks; Encoder-decoder networks
              • G06N3/08 Learning methods
      • G10 MUSICAL INSTRUMENTS; ACOUSTICS
        • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
          • G10L15/00 Speech recognition
            • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
              • G10L15/063 Training
            • G10L15/08 Speech classification or search
              • G10L15/16 Speech classification or search using artificial neural networks
            • G10L15/28 Constructional details of speech recognition systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Acoustics & Sound (AREA)
  • Human Computer Interaction (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Multimedia (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Machine Translation (AREA)
  • Storage Device Security (AREA)
GB2300276.9A 2023-01-09 2023-01-09 Systems and Methods for Speech Processing Active GB2626038B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
GB2300276.9A GB2626038B (en) 2023-01-09 2023-01-09 Systems and Methods for Speech Processing
JP2023208671A JP7673166B2 (en) 2023-01-09 2023-12-11 Systems and methods for audio processing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB2300276.9A GB2626038B (en) 2023-01-09 2023-01-09 Systems and Methods for Speech Processing

Publications (2)

Publication Number Publication Date
GB2626038A GB2626038A (en) 2024-07-10
GB2626038B GB2626038B (en) 2025-09-24

Family

ID=91472160

Family Applications (1)

Application Number Title Priority Date Filing Date
GB2300276.9A Active GB2626038B (en) 2023-01-09 2023-01-09 Systems and Methods for Speech Processing

Country Status (2)

Country Link
JP (1) JP7673166B2 (en)
GB (1) GB2626038B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113593574A (en) * 2021-08-25 2021-11-02 广州虎牙科技有限公司 Voice recognition method, computer program product and electronic equipment

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0252400A (en) * 1988-08-15 1990-02-21 Oki Electric Ind Co Ltd Speech recognition device
JP5688677B2 (en) * 2010-10-04 2015-03-25 日本電気株式会社 Voice input support device
JP7210938B2 (en) * 2018-08-29 2023-01-24 富士通株式会社 Text generation device, text generation program and text generation method
KR102868990B1 (en) * 2018-11-14 2025-10-10 삼성전자주식회사 A decoding method in an artificial neural network and an apparatus thereof
CN118865957A (en) * 2019-07-09 2024-10-29 谷歌有限责任公司 On-device speech synthesis of text snippets for training on-device speech recognition models
JP2021039216A (en) * 2019-09-02 2021-03-11 日本電信電話株式会社 Speech recognition device, speech recognition method and speech recognition program
US11521595B2 (en) * 2020-05-01 2022-12-06 Google Llc End-to-end multi-talker overlapping speech recognition
CN116848579A (en) * 2020-10-20 2023-10-03 谷歌有限责任公司 Fast transmitting low-delay stream ASR with sequence-level transmitting regularization
US20220261631A1 (en) * 2021-02-12 2022-08-18 Nvidia Corporation Pipelines for efficient training and deployment of machine learning models

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Higuchi et al, 2020. "Mask CTC: Non-Autoregressive End-to-End ASR with CTC and Mask Predict". Available at https://arxiv.org/pdf/2005.08700.pdf [Accessed 10 July 2023] *

Also Published As

Publication number Publication date
JP2024098143A (en) 2024-07-22
JP7673166B2 (en) 2025-05-08
GB2626038A (en) 2024-07-10

Similar Documents

Publication Publication Date Title
EP4336490A4 (en) Voice processing method and related device
AU2021358888A9 (en) Systems and methods for processing
GB2600987B (en) Speech Recognition Systems and Methods
GB2613581B (en) Systems and methods for speech recognition
GB2613690B (en) Systems and methods for order processing
GB2598563B (en) System and method for speech processing
EP4254408A4 (en) Speech processing method and apparatus, and apparatus for processing speech
GB2602976B (en) Speech recognition systems and methods
EP4155929A4 (en) Processing system and processing method
GB202015911D0 (en) Systems and methods for processing a sample
EP4214634A4 (en) Systems and methods for object recognition
GB202020476D0 (en) Processing system and method
GB2626038B (en) Systems and Methods for Speech Processing
GB2623542B (en) Audio cancellation system and method
GB2612079B (en) Voice command processing method and apparatus
GB2607950B (en) Audio cancellation system and method
GB2607947B (en) Audio cancellation system and method
GB2582572B (en) A speech processing system and method
GB2607992B (en) Speech processing method and apparatus
GB202114725D0 (en) Systems and methods for order processing
GB2619071B (en) Secure processing system and method
GB202305509D0 (en) Audio processing system and method
GB202208444D0 (en) Audio processing system and method
GB202114723D0 (en) System and methods for order processing
GB202017819D0 (en) Audio processing system and method