Peng Gao
Z-Image Team, Alibaba Group
Verified email at alibaba-inc.com
Title | Cited by | Year
CLIP-Adapter: Better vision-language models with feature adapters
P Gao, S Geng, R Zhang, T Ma, R Fang, Y Zhang, H Li, Y Qiao
International Journal of Computer Vision 132 (2), 581-595, 2024
Cited by: 1709 | Year: 2024
LLaMA-Adapter: Efficient fine-tuning of language models with zero-init attention
R Zhang, J Han, C Liu, P Gao, A Zhou, X Hu, S Yan, P Lu, H Li, Y Qiao
arXiv preprint arXiv:2303.16199, 2023
Cited by: 1286* | Year: 2023
Tip-Adapter: Training-free CLIP-Adapter for better vision-language modeling
R Zhang, R Fang, W Zhang, P Gao, K Li, J Dai, Y Qiao, H Li
arXiv preprint arXiv:2111.03930, 2021
Cited by: 1242* | Year: 2021
UniFormer: Unified transformer for efficient spatiotemporal representation learning
K Li, Y Wang, P Gao, G Song, Y Liu, H Li, Y Qiao
arXiv preprint arXiv:2201.04676, 2022
Cited by: 1116* | Year: 2022
LLaMA-Adapter V2: Parameter-efficient visual instruction model
P Gao, J Han, R Zhang, Z Lin, S Geng, A Zhou, W Zhang, P Lu, C He, ...
arXiv preprint arXiv:2304.15010, 2023
Cited by: 775 | Year: 2023
PointCLIP: Point cloud understanding by CLIP
R Zhang, Z Guo, W Zhang, K Li, X Miao, B Cui, Y Qiao, P Gao, H Li
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2022
Cited by: 695 | Year: 2022
Dynamic fusion with intra- and inter-modality attention flow for visual question answering
P Gao, Z Jiang, H You, P Lu, SCH Hoi, X Wang, H Li
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2019
Cited by: 525* | Year: 2019
MathVerse: Does your multi-modal LLM truly see the diagrams in visual math problems?
R Zhang, D Jiang, Y Zhang, H Lin, Z Guo, P Qiu, A Zhou, P Lu, KW Chang, ...
European Conference on Computer Vision, 169-186, 2024
Cited by: 479 | Year: 2024
OmniQuant: Omnidirectionally calibrated quantization for large language models
W Shao, M Chen, Z Zhang, P Xu, L Zhao, Z Li, K Zhang, P Gao, Y Qiao, ...
arXiv preprint arXiv:2308.13137, 2023
Cited by: 461 | Year: 2023
Fast convergence of DETR with spatially modulated co-attention
P Gao, M Zheng, X Wang, J Dai, H Li
Proceedings of the IEEE/CVF international conference on computer vision …, 2021
Cited by: 442 | Year: 2021
Point-M2AE: Multi-scale masked autoencoders for hierarchical point cloud pre-training
R Zhang, Z Guo, P Gao, R Fang, B Zhao, D Wang, Y Qiao, H Li
Advances in neural information processing systems 35, 27061-27074, 2022
Cited by: 411 | Year: 2022
Personalize Segment Anything Model with one shot
R Zhang, Z Jiang, Z Guo, S Yan, J Pan, X Ma, H Dong, P Gao, H Li
arXiv preprint arXiv:2305.03048, 2023
Cited by: 343 | Year: 2023
SPHINX: The joint mixing of weights, tasks, and visual embeddings for multi-modal large language models
Z Lin, C Liu, R Zhang, P Gao, L Qiu, H Xiao, H Qiu, C Lin, W Shao, ...
arXiv preprint arXiv:2311.07575, 2023
Cited by: 339 | Year: 2023
Frozen CLIP models are efficient video learners
Z Lin, S Geng, R Zhang, P Gao, G De Melo, X Wang, J Dai, Y Qiao, H Li
European Conference on Computer Vision, 388-404, 2022
Cited by: 326 | Year: 2022
LVLM-eHub: A comprehensive evaluation benchmark for large vision-language models
P Xu, W Shao, K Zhang, P Gao, S Liu, M Lei, F Meng, S Huang, Y Qiao, ...
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024
Cited by: 324 | Year: 2024
You only need 90k parameters to adapt light: a light weight transformer for image enhancement and exposure correction
Z Cui, K Li, L Gu, S Su, P Gao, Z Jiang, Y Qiao, T Harada
arXiv preprint arXiv:2205.14871, 2022
Cited by: 303 | Year: 2022
Prompt, generate, then cache: Cascade of foundation models makes strong few-shot learners
R Zhang, X Hu, B Li, S Huang, H Deng, Y Qiao, P Gao, H Li
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2023
Cited by: 280 | Year: 2023
End-to-end object detection with Adaptive Clustering Transformer
M Zheng, P Gao, R Zhang, K Li, X Wang, H Li, H Dong
arXiv preprint arXiv:2011.09315, 2020
Cited by: 276 | Year: 2020
PointCLIP V2: Prompting CLIP and GPT for powerful 3D open-world learning
X Zhu, R Zhang, B He, Z Guo, Z Zeng, Z Qin, S Zhang, P Gao
Proceedings of the IEEE/CVF international conference on computer vision …, 2023
Cited by: 267 | Year: 2023
MonoDETR: Depth-guided transformer for monocular 3D object detection
R Zhang, H Qiu, T Wang, Z Guo, Z Cui, Y Qiao, H Li, P Gao
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023
Cited by: 266 | Year: 2023