MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
Use PEFT or full-parameter training to fine-tune 400+ LLMs or 100+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL, Phi3.5-Vision, ...)
RLAIF-V: Aligning MLLMs through Open-Source AI Feedback for Super GPT-4V Trustworthiness
Explore LLM deployment based on AXera's AI chips.
PicQ: Demo for MiniCPM-V 2.6 to answer questions about images using natural language (see the usage sketch after this list).
A Colaboratory sample for the lightweight VLM MiniCPM-V 2.6.
VidiQA: Demo for MiniCPM-V 2.6 to answer questions about videos using natural language.
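
Both demos wrap the same underlying chat interface. Below is a minimal sketch of querying MiniCPM-V 2.6 with Hugging Face Transformers, based on the usage pattern in the MiniCPM-V repository README; the model id (openbmb/MiniCPM-V-2_6), the model.chat() signature, and the video-related keyword arguments follow that README and may differ across model revisions. The image and frame file names are hypothetical, and the actual internals of PicQ and VidiQA may differ.

    import torch
    from PIL import Image
    from transformers import AutoModel, AutoTokenizer

    # Load the model and tokenizer (bfloat16 on GPU, per the model card).
    model_id = "openbmb/MiniCPM-V-2_6"
    model = AutoModel.from_pretrained(
        model_id, trust_remote_code=True, torch_dtype=torch.bfloat16
    ).eval().cuda()
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

    # Image QA (what PicQ demonstrates): pass the image and the question together.
    image = Image.open("example.jpg").convert("RGB")  # hypothetical local file
    msgs = [{"role": "user", "content": [image, "What is shown in this image?"]}]
    print(model.chat(image=None, msgs=msgs, tokenizer=tokenizer))

    # Video QA (what VidiQA demonstrates): sample frames and pass them as a list.
    frames = [Image.open(f"frame_{i}.jpg").convert("RGB") for i in range(8)]  # hypothetical frames
    msgs = [{"role": "user", "content": frames + ["Describe what happens in this video."]}]
    print(model.chat(image=None, msgs=msgs, tokenizer=tokenizer,
                     use_image_id=False, max_slice_nums=2))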