SMaRT: The smart meeting room task at ISL
Waibel et al., 2003
- Document ID
- 11368825063672847323
- Author
- Waibel A
- Schultz T
- Bett M
- Denecke M
- Malkin R
- Rogina I
- Stiefelhagen R
- Yang J
- Publication year
- 2003
- Publication venue
- 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP'03).
Snippet
As computational and communications systems become increasingly smaller, faster, more powerful, and more integrated, the goal of interactive, integrated meeting support rooms is slowly becoming reality. It is already possible, for instance, to rapidly locate task-related …
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
- G10L15/265—Speech recognisers specially adapted for particular applications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M2201/00—Electronic components, circuits, software, systems or apparatus used in telephone systems
- H04M2201/40—Electronic components, circuits, software, systems or apparatus used in telephone systems using speech recognition
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| Waibel et al. | SMaRT: The smart meeting room task at ISL | |
| CN112075075B (en) | Method and computerized intelligent assistant for facilitating teleconferencing | |
| Morency et al. | Contextual recognition of head gestures | |
| CN108227932B (en) | Interaction intent determination method and apparatus, computer equipment and storage medium | |
| US10896688B2 (en) | Real-time conversation analysis system | |
| KR100580619B1 (en) | Method and device for managing conversation between user and agent | |
| US8407049B2 (en) | Systems and methods for conversation enhancement | |
| KR100586767B1 (en) | System and method for multimode focus detection, reference ambiguity resolution and mood classification using multimode input | |
| Kafle et al. | Predicting the understandability of imperfect english captions for people who are deaf or hard of hearing | |
| EP3776171A1 (en) | Non-disruptive nui command | |
| CN113129867B (en) | Speech recognition model training method, speech recognition method, device and equipment | |
| McCowan et al. | Towards computer understanding of human interactions | |
| Nakano et al. | Implementation and evaluation of a multimodal addressee identification mechanism for multiparty conversation systems | |
| Cabañas-Molero et al. | Multimodal speaker diarization for meetings using volume-evaluated SRP-PHAT and video analysis | |
| JP7230803B2 (en) | Information processing device and information processing method | |
| Strauß et al. | Wizard-of-Oz Data Collection for Perception and Interaction in Multi-User Environments. | |
| Roy et al. | Wearable audio computing: A survey of interaction techniques | |
| Zhang et al. | Videoconference interpreting goes multimodal: Some insights and a tentative proposal | |
| Hirayama et al. | Info-concierge: Proactive multi-modal interaction through mind probing | |
| Metze et al. | The “FAME” interactive space | |
| Kawahara | Smart posterboard: Multi-modal sensing and analysis of poster conversations | |
| CN113707130B (en) | Voice recognition method and device for voice recognition | |
| Minker et al. | Next-generation human-computer interfaces-Towards intelligent, adaptive and proactive spoken language dialogue systems | |
| EP3910626A1 (en) | Presentation control | |
| Vildjiounaite et al. | Requirements and software framework for adaptive multimodal affect recognition |