Starred repositories
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect , Segment and Generate Anything
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
An open source implementation of CLIP.
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
Chinese safety prompts for evaluating and improving the safety of LLMs. 中文安全prompts,用于评估和提升大模型的安全性。
Official implementation of "Segment Any Anomaly without Training via Hybrid Prompt Regularization (SAA+)".
Unofficial implementation of Google "CutPaste: Self-Supervised Learning for Anomaly Detection and Localization" in PyTorch
Using modified BiSeNet for face parsing in PyTorch
Real time interactive streaming digital human
Fay is an open-source digital human framework integrating language models and digital characters. It offers retail, assistant, and agent versions for diverse applications like virtual shopping guid…
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
An open-source project dedicated to tracking and segmenting any objects in videos, either automatically or interactively. The primary algorithms utilized include the Segment Anything Model (SAM) fo…
Track-Anything is a flexible and interactive tool for video object tracking and segmentation, based on Segment Anything, XMem, and E2FGVI.
Point-SAM: This is the official repository of "Point-SAM: Promptable 3D Segmentation Model for Point Clouds". We provide codes for running our demo and links to download checkpoints.
Aligning pretrained language models with instruction data generated by themselves.
Kanchil(鼷鹿)是世界上最小的偶蹄目动物,这个开源项目意在探索小模型(6B以下)是否也能具备和人类偏好对齐的能力。
High-Resolution Image Synthesis with Latent Diffusion Models
A Trimap-Free Portrait Matting Solution in Real Time [AAAI 2022]
Segment Anything in High Quality [NeurIPS 2023]
Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.
[NeurIPS 2022] Towards Robust Blind Face Restoration with Codebook Lookup Transformer
用于从头预训练+SFT一个小参数量的中文LLaMa2的仓库;24G单卡即可运行得到一个具备简单中文问答能力的chat-llama2.
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism
Streamer-Sales 销冠 —— 卖货主播 LLM 大模型🛒🎁,一个能够根据给定的商品特点从激发用户购买意愿角度出发进行商品解说的卖货主播大模型。🚀⭐内含详细的数据生成流程❗ 📦另外还集成了 LMDeploy 加速推理🚀、RAG检索增强生成 📚、TTS文字转语音🔊、数字人生成 🦸、 Agent 使用网络查询实时信息🌐、ASR 语音转文字🎙️、Vue 生态搭建前端🍍、FastAPI 搭…
Visual self-questioning for large vision-language assistant.
Code for Talk With Human-like Agents: Empathetic Dialogue Through Perceptible Acoustic Reception and Reaction (ACL24))
Open Source Image and Video Restoration Toolbox for Super-resolution, Denoise, Deblurring, etc. Currently, it includes EDSR, RCAN, SRResNet, SRGAN, ESRGAN, EDVR, BasicVSR, SwinIR, ECBSR, etc. Also …
Run PyTorch LLMs locally on servers, desktop and mobile