WO2025029631A3 - Spoken language control system - Google Patents

Spoken language control system

Info

Publication number
WO2025029631A3
Authority
WO
WIPO (PCT)
Prior art keywords
control system
utterance
input
domain
spoken language
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/US2024/039755
Other languages
French (fr)
Other versions
WO2025029631A2 (en)
Inventor
Milutin PAJOVIC
Cristobal Alessandri
Simon CHARLOW
Nicholas MORAN
Jonathan Samuel YEDIDIA
Sean DEYO
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Analog Devices Inc
Original Assignee
Analog Devices Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Analog Devices Inc
Publication of WO2025029631A2
Publication of WO2025029631A3
Current legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/226 Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics
    • G10L2015/228 Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics of application context

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Machine Translation (AREA)

Abstract

Systems and techniques for processing spoken language to use as input to a control system are described herein. After an utterance is obtained from a user in a general language, a generative neural network model is invoked on the utterance to transform the utterance into a phrase that conforms to a domain-specific language. The domain-specific language phrase is provided to a control system that accepts phrases of the domain-specific language as input and controls a device based on the input.
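The flow described in the abstract (utterance, then transformation into a domain-specific language phrase, then a control system acting on that phrase) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the application: the toy DSL of the form `SET <device> <property> <integer>`, the `stub_generative_model` function (a keyword-rule stand-in, not a generative neural network), and the `ControlSystem` class.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical DSL for illustration only: "SET <device> <property> <integer>".
DSL_PATTERN = re.compile(
    r"^SET (?P<device>[a-z_]+) (?P<prop>[a-z_]+) (?P<value>-?\d+)$"
)


def stub_generative_model(utterance: str) -> str:
    """Stand-in for the generative neural network model.

    A real system would invoke a trained model here; this stub uses
    keyword rules so that the example stays self-contained.
    """
    text = utterance.lower()
    number = re.search(r"-?\d+", text)
    if number and ("thermostat" in text or "temperature" in text):
        return f"SET thermostat temperature {number.group()}"
    if number and "light" in text:
        return f"SET light brightness {number.group()}"
    return "UNKNOWN"  # deliberately not a valid DSL phrase


@dataclass
class ControlSystem:
    """Accepts only phrases of the DSL and updates device state."""

    state: Dict[str, Dict[str, int]] = field(default_factory=dict)

    def execute(self, phrase: str) -> bool:
        match = DSL_PATTERN.match(phrase)
        if match is None:
            return False  # reject anything outside the DSL
        device_state = self.state.setdefault(match["device"], {})
        device_state[match["prop"]] = int(match["value"])
        return True


def handle_utterance(
    utterance: str,
    model: Callable[[str], str],
    control: ControlSystem,
) -> bool:
    """Transform a general-language utterance and pass it to the controller."""
    phrase = model(utterance)
    return control.execute(phrase)
```

In this sketch the control system validates every phrase against the DSL before acting, so output from the model that falls outside the DSL is rejected rather than executed. Whether the described system validates phrases this way is not stated in the abstract.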

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US202363516404P 2023-07-28 2023-07-28
US63/516,404 2023-07-28
US202363531167P 2023-08-07 2023-08-07
US63/531,167 2023-08-07

Publications (2)

Publication Number Publication Date
WO2025029631A2 (en) 2025-02-06
WO2025029631A3 (en) 2025-04-17

Family

ID=94395859

Family Applications (1)

Application Number Status Publication Priority Date Filing Date Title
PCT/US2024/039755 Pending WO2025029631A2 (en) 2023-07-28 2024-07-26 Spoken language control system

Country Status (1)

Country Link
WO (1) WO2025029631A2 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180329993A1 (en) * 2017-05-11 2018-11-15 Commvault Systems, Inc. Natural language processing integrated with database and data storage management
US11361764B1 (en) * 2019-01-03 2022-06-14 Amazon Technologies, Inc. Device naming-indicator generation
US20230215441A1 (en) * 2020-06-04 2023-07-06 Microsoft Technology Licensing, Llc Providing prompts in speech recognition results in real time

Also Published As

Publication number Publication date
WO2025029631A2 (en) 2025-02-06

Similar Documents

Publication Publication Date Title
EP3958256A3 (en) Voice processing method, apparatus, device and storage medium for vehicle-mounted device
EP3859735A3 (en) Voice conversion method, voice conversion apparatus, electronic device, and storage medium
WO2022043675A3 (en) A computer implemented method for the automated analysis or use of data
CN108257616A (en) Interactive detection method and device
WO2022046781A8 (en) Reference-free foreign accent conversion system and method
EP4576069A3 (en) Full-duplex utterance processing in a natural language virtual assistant
ATE286296T1 (en) NETWORK INTERACTIVE USER INTERFACE USING LANGUAGE RECOGNITION AND NATURAL LANGUAGE PROCESSING
IL157454A0 (en) Natural language query system for accessing an information system
US20230298616A1 (en) System and Method For Identifying Sentiment (Emotions) In A Speech Audio Input with Haptic Output
Barua et al. Neural network based recognition of speech using MFCC features
CN116933806A (en) Concurrent translation system and concurrent translation terminal
Das et al. HLT-NUS submission for 2020 NIST conversational telephone speech SRE
Alibegović et al. Speech recognition system for a service robot-a performance evaluation
Bawa et al. Noise robust in-domain children speech enhancement for automatic Punjabi recognition system under mismatched conditions
Bachate et al. Automatic speech recognition systems for regional languages in India
Wang et al. Learning explicit prosody models and deep speaker embeddings for atypical voice conversion
GB2641461A (en) System and method for artificial intelligence-based language skill assessment and development
Lai et al. Mel-Scale Frequency Extraction and Classification of Dialect-Speech Signals with 1D CNN based Classifier for Gender and Region Recognition
ATE405920T1 (en) GENERATING A LANGUAGE RECOGNITION GRAMMAR FOR ALPHANUMERIC EXPRESSIONS
Sharma et al. ASR—A real-time speech recognition on portable devices
CN110782895A (en) Man-machine voice system based on artificial intelligence
WO2025029631A3 (en) Spoken language control system
Hegde et al. Continuous speech recognition using joint features derived from the modified group delay function and MFCC.
Xinyuan et al. Non-parallel many-to-many voice conversion by knowledge transfer from a text-to-speech model
Kimura et al. End-to-End Deep Learning Speech Recognition Model for Silent Speech Challenge.

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 24849886

Country of ref document: EP

Kind code of ref document: A2