WO2025029631A3 - Spoken language control system
- Publication number
- WO2025029631A3 (application PCT/US2024/039755)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- control system
- utterance
- input
- domain
- spoken language
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/226—Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics
- G10L2015/228—Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics of application context
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Biomedical Technology (AREA)
- General Engineering & Computer Science (AREA)
- Biophysics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Probability & Statistics with Applications (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Machine Translation (AREA)
Abstract
Systems and techniques for processing spoken language for use as input to a control system are described herein. After an utterance is obtained from a user in a general language, a generative neural network model is invoked on the utterance to transform it into a phrase that conforms to a domain-specific language. The domain-specific language phrase is provided to a control system that accepts phrases of the domain-specific language as input and controls a device based on that input.
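The flow described in the abstract (utterance, then translation into a domain-specific phrase, then control system) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the method of the application: the toy grammar `DSL_PATTERN`, the `ControlSystem` class, and `handle_utterance` are hypothetical names, and `stub_translate` is a plain lookup table standing in for the generative neural network model, which this record does not specify.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical domain-specific language: "<VERB> <device> [<integer>]",
# e.g. "ON lamp" or "SET thermostat 21". Not taken from the application.
DSL_PATTERN = re.compile(r"(SET|ON|OFF) ([a-z_]+)(?: (\d+))?")


def stub_translate(utterance: str) -> str:
    """Stand-in for the generative model: a lookup table, not a neural network."""
    table = {
        "please turn on the lamp": "ON lamp",
        "make it twenty one degrees": "SET thermostat 21",
    }
    # Unknown utterances map to an empty string, which the control system rejects.
    return table.get(utterance.lower().strip(), "")


@dataclass
class ControlSystem:
    """Toy control system that only accepts phrases of the hypothetical DSL."""

    state: Dict[str, object] = field(default_factory=dict)

    def execute(self, phrase: str) -> None:
        match = DSL_PATTERN.fullmatch(phrase)
        if match is None:
            raise ValueError(f"not a valid DSL phrase: {phrase!r}")
        verb, device, value = match.groups()
        if verb == "SET":
            if value is None:
                raise ValueError("SET requires an integer value")
            self.state[device] = int(value)
        else:
            # ON and OFF are stored as booleans; any trailing value is ignored.
            self.state[device] = verb == "ON"


def handle_utterance(
    utterance: str,
    translate: Callable[[str], str],
    control: ControlSystem,
) -> str:
    """Translate a general-language utterance and pass the result to the control system."""
    phrase = translate(utterance)
    control.execute(phrase)
    return phrase


if __name__ == "__main__":
    control = ControlSystem()
    handle_utterance("Please turn on the lamp", stub_translate, control)
    handle_utterance("Make it twenty one degrees", stub_translate, control)
    print(control.state)
```

Validating the translated phrase against the grammar before acting on it is a design choice of this sketch: it keeps the control system's input restricted to the domain-specific language even when the translator produces something unexpected.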
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202363516404P (US63/516,404) | 2023-07-28 | 2023-07-28 | |
| US202363531167P (US63/531,167) | 2023-08-07 | 2023-08-07 | |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| WO2025029631A2 (en) | 2025-02-06 |
| WO2025029631A3 (en) | 2025-04-17 |
Family
ID=94395859
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/US2024/039755 (pending; published as WO2025029631A2 (en)) | Spoken language control system | 2023-07-28 | 2024-07-26 |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2025029631A2 (en) |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20180329993A1 (en) * | 2017-05-11 | 2018-11-15 | Commvault Systems, Inc. | Natural language processing integrated with database and data storage management |
| US11361764B1 (en) * | 2019-01-03 | 2022-06-14 | Amazon Technologies, Inc. | Device naming-indicator generation |
| US20230215441A1 (en) * | 2020-06-04 | 2023-07-06 | Microsoft Technology Licensing, Llc | Providing prompts in speech recognition results in real time |
-
2024
- 2024-07-26 WO PCT/US2024/039755 patent/WO2025029631A2/en active Pending
Also Published As
| Publication number | Publication date |
|---|---|
| WO2025029631A2 (en) | 2025-02-06 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| EP3958256A3 (en) | Voice processing method, apparatus, device and storage medium for vehicle-mounted device | |
| EP3859735A3 (en) | Voice conversion method, voice conversion apparatus, electronic device, and storage medium | |
| WO2022043675A3 (en) | A computer implemented method for the automated analysis or use of data | |
| CN108257616A (en) | Interactive detection method and device | |
| WO2022046781A8 (en) | Reference-free foreign accent conversion system and method | |
| EP4576069A3 (en) | Full-duplex utterance processing in a natural language virtual assistant | |
| ATE286296T1 (en) | NETWORK INTERACTIVE USER INTERFACE USING LANGUAGE RECOGNITION AND NATURAL LANGUAGE PROCESSING | |
| IL157454A0 (en) | Natural language query system for accessing an information system | |
| US20230298616A1 (en) | System and Method For Identifying Sentiment (Emotions) In A Speech Audio Input with Haptic Output | |
| Barua et al. | Neural network based recognition of speech using MFCC features | |
| CN116933806A (en) | Concurrent translation system and concurrent translation terminal | |
| Das et al. | HLT-NUS submission for 2020 NIST conversational telephone speech SRE | |
| Alibegović et al. | Speech recognition system for a service robot-a performance evaluation | |
| Bawa et al. | Noise robust in-domain children speech enhancement for automatic Punjabi recognition system under mismatched conditions | |
| Bachate et al. | Automatic speech recognition systems for regional languages in India | |
| Wang et al. | Learning explicit prosody models and deep speaker embeddings for atypical voice conversion | |
| GB2641461A (en) | System and method for artificial intelligence-based language skill assessment and development | |
| Lai et al. | Mel-Scale Frequency Extraction and Classification of Dialect-Speech Signals with 1D CNN based Classifier for Gender and Region Recognition | |
| ATE405920T1 (en) | GENERATING A LANGUAGE RECOGNITION GRAMMAR FOR ALPHANUMERIC EXPRESSIONS | |
| Sharma et al. | ASR—A real-time speech recognition on portable devices | |
| CN110782895A (en) | Man-machine voice system based on artificial intelligence | |
| WO2025029631A3 (en) | Spoken language control system | |
| Hegde et al. | Continuous speech recognition using joint features derived from the modified group delay function and MFCC. | |
| Xinyuan et al. | Non-parallel many-to-many voice conversion by knowledge transfer from a text-to-speech model | |
| Kimura et al. | End-to-End Deep Learning Speech Recognition Model for Silent Speech Challenge. |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 24849886; Country of ref document: EP; Kind code of ref document: A2 |