
Waibel et al., 2003 - Google Patents

SMaRT: The smart meeting room task at ISL

Waibel et al., 2003

Document ID
11368825063672847323
Author
Waibel A
Schultz T
Bett M
Denecke M
Malkin R
Rogina I
Stiefelhagen R
Yang J
Publication year
2003
Publication venue
2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP'03).

Snippet

As computational and communications systems become increasingly smaller, faster, more powerful, and more integrated, the goal of interactive, integrated meeting support rooms is slowly becoming reality. It is already possible, for instance, to rapidly locate task-related …

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L2015/088 Word spotting
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G10L15/265 Speech recognisers specially adapted for particular applications
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/065 Adaptation
    • G10L15/07 Adaptation to the speaker
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/26 Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/06 Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2201/00 Electronic components, circuits, software, systems or apparatus used in telephone systems
    • H04M2201/40 Electronic components, circuits, software, systems or apparatus used in telephone systems using speech recognition

Similar Documents

Publication Publication Date Title
Waibel et al. SMaRT: The smart meeting room task at ISL
CN112075075B (en) Method and computerized intelligent assistant for facilitating teleconferencing
Morency et al. Contextual recognition of head gestures
CN108227932B (en) Interaction intent determination method and apparatus, computer equipment and storage medium
US10896688B2 (en) Real-time conversation analysis system
KR100580619B1 (en) Method and device for managing conversation between user and agent
US8407049B2 (en) Systems and methods for conversation enhancement
KR100586767B1 (en) System and method for multimode focus detection, reference ambiguity resolution and mood classification using multimode input
Kafle et al. Predicting the understandability of imperfect english captions for people who are deaf or hard of hearing
EP3776171A1 (en) Non-disruptive nui command
CN113129867B (en) Speech recognition model training method, speech recognition method, device and equipment
McCowan et al. Towards computer understanding of human interactions
Nakano et al. Implementation and evaluation of a multimodal addressee identification mechanism for multiparty conversation systems
Cabañas-Molero et al. Multimodal speaker diarization for meetings using volume-evaluated SRP-PHAT and video analysis
JP7230803B2 (en) Information processing device and information processing method
Strauß et al. Wizard-of-Oz Data Collection for Perception and Interaction in Multi-User Environments.
Roy et al. Wearable audio computing: A survey of interaction techniques
Zhang et al. Videoconference interpreting goes multimodal: Some insights and a tentative proposal
Hirayama et al. Info-concierge: Proactive multi-modal interaction through mind probing
Metze et al. The “FAME” interactive space
Kawahara Smart posterboard: Multi-modal sensing and analysis of poster conversations
CN113707130B (en) Voice recognition method and device for voice recognition
Minker et al. Next-generation human-computer interfaces-Towards intelligent, adaptive and proactive spoken language dialogue systems
EP3910626A1 (en) Presentation control
Vildjiounaite et al. Requirements and software framework for adaptive multimodal affect recognition