Stars
The official implementation of Achieving Cross Modal Generalization with Multimodal Unified Representation (NeurIPS '23)
[CVPR 2024] ViT-Lens: Towards Omni-modal Representations
💻 🤖 A summary on our attempts at using Deep Learning approaches for Emotional Text to Speech 🔈
Code for ALBEF: a new vision-language pre-training method
[ICLR 23 oral] The Modality Focusing Hypothesis: Towards Understanding Crossmodal Knowledge Distillation
A curated list of Multimodal Related Research.
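VCED automatically identifies the segments of a video that match your text description and cuts the video accordingly. The project is built on cross-modal search and vector retrieval, with a decoupled front end and back end, to help you quickly get hands-on with next-generation search technology.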
A GPT-4 AI Tutor Prompt for customizable personalized learning experiences.
Awesome List of Attention Modules and Plug&Play Modules in Computer Vision
📜 A Novel Facial Emotion Recognition Model Using Segmentation VGG-19 Architecture
MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversation
A deployed facial emotion recognition system that uses a neural network model to predict emotions from faces in images, videos, and live webcam feeds.
Efficient face emotion recognition in photos and videos
Real-Time facial emotion recognition in Python
An efficient music recommendation system that determines the user's emotion using facial recognition techniques.
Computer Vision module for detecting the emotion, age, and gender of a person in any given image, video, or real-time webcam feed. A custom VGG16 model was developed and trained on open source facial datasets…
[NeurIPS 2021] Multiscale Benchmarks for Multimodal Representation Learning
Real-time face detection and emotion/gender classification using fer2013/imdb datasets with a keras CNN model and openCV.
A real-time Multimodal Emotion Recognition web app for text, sound and video inputs
OpenFace – a state-of-the-art tool intended for facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation.
Official implementation of OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on
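A classroom attentiveness and exam cheating detection system with dynamic classroom roll call, combining emotion recognition, facial expression recognition, pose recognition, and face recognition.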
Code release for "FECAM: Frequency Enhanced Channel Attention Mechanism for Time Series Forecasting" ⌚
The most complete Stable Diffusion tutorial series on the web, from beginner to advanced, three months in the making.