An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"
Updated Apr 12, 2024 - Python
An official implementation for "UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation"
An official implementation for "X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval"
PyTorch code for "Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners"
A PyTorch implementation of state-of-the-art video captioning models from 2015-2019 on the MSVD and MSRVTT datasets.
Source code for "Semantics-Assisted Video Captioning Model Trained with Scheduled Sampling Strategy"
Source code for "Captionomaly: A Deep Learning Toolbox for Anomaly Captioning in Surveillance Videos"