Starred repositories
EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation
Janus-Series: Unified Multimodal Understanding and Generation Models
World Model based Autonomous Driving Platform in CARLA 🚗
Official code implementation for the paper "X-Drive: Cross-modality Consistent Multi-Sensor Data Synthesis for Driving Scenarios"
[NeurIPS 2024] Data exporter for SS3DM: Benchmarking Street-View Surface Reconstruction with a Synthetic 3D Mesh Dataset
AI-powered ab initio biomolecular dynamics simulation
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism
Official code for "AutoVFX: Physically Realistic Video Editing from Natural Language Instructions."
mllm-npu: training multimodal large language models on Ascend NPUs
Ongoing research training transformer models at scale
[WACV2025] Linking Omni-Depth with View Synthesis through Multi-Sphere Image aided Generalizable Neural Radiance Field
Allegro is a powerful text-to-video model that generates high-quality videos up to 6 seconds at 15 FPS and 720p resolution from simple text input.
Enjoy the magic of Diffusion models!
[ICCV 2023 Oral] Game-theoretic modeling and learning of Transformer-based interactive prediction and planning
Official implementation of “LucidFusion: Generating 3D Gaussians with Arbitrary Unposed Images”
[ECCV 2024] This is the official implementation of HRMapNet, maintaining and utilizing a low-cost global rasterized map to enhance online vectorized map perception.
Simple project page template for your research paper, built with Astro and Tailwind CSS
CogVideoX-LoRAs is a centralized repository for all LoRA models created for CogVideoX, filling the gap for a unified sharing space. With the rising demand for customized video generation, this hub …
Memory-optimized finetuning scripts for CogVideoX using TorchAO and DeepSpeed
Code of Pyramidal Flow Matching for Efficient Video Generative Modeling
[IEEE TIV] OccFusion: Multi-Sensor Fusion Framework for 3D Semantic Occupancy Prediction
Official Implementation of paper "MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion"
NeuroNCAP benchmark for end-to-end autonomous driving
Official code for "ControlAR: Controllable Image Generation with Autoregressive Models"
🌊 Images → 2.5D Parallax Effect Video. A free and open-source ImmersityAI alternative
[CVPR2024] Official Repository of Paper "Panacea: Panoramic and Controllable Video Generation for Autonomous Driving"