Stars
Stable Diffusion web UI
A Python package for segmenting geospatial data with the Segment Anything Model (SAM)
My implementation of "Patch n’ Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution"
Segment Anything with Deictic Prompting
This is the official PyTorch implementation of the paper
Official code of "EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model"
[CVPR 2024 Highlight🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
Implementation of ViTAR: Vision Transformer with Any Resolution in PyTorch
RS5M: a large-scale vision language dataset for remote sensing [TGRS]
[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
WebUI extension for ControlNet
Adapting Meta AI's Segment Anything to Downstream Tasks with Adapters and Prompts
[ECCV 2024] The official code of the paper "Open-Vocabulary SAM".
[TGRS 2024] MSNet: Self-Supervised Multiscale Network with Enhanced Separation Training for Hyperspectral Anomaly Detection.
Pipeline to collect dataset for GRAFT
Track-Anything is a flexible and interactive tool for video object tracking and segmentation, based on Segment Anything, XMem, and E2FGVI.
A state-of-the-art-level open visual language model | Multimodal pre-trained model
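Use ChatGPT to summarize arXiv papers. Accelerates the full research workflow: ChatGPT-based full-paper summarization, professional translation, polishing, reviewing, and review responses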
Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation
Make your models invariant to changes in scale.
[ICCV2023] Segment Every Reference Object in Spatial and Temporal Spaces
[ICLR 2022] "Anti-Oversmoothing in Deep Vision Transformers via the Fourier Domain Analysis: From Theory to Practice" by Peihao Wang, Wenqing Zheng, Tianlong Chen, Zhangyang Wang
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
Official PyTorch Implementation of Global Rectification and Decoupled Registration for Few-Shot Segmentation in Remote Sensing Imagery (TGRS'23).
The implementation of the technical report: "Customized Segment Anything Model for Medical Image Segmentation"
A codebase for flexible and efficient Image Text Representation Alignment
Reorder-based post-training quantization for large language model