Peng Gao

Cited by

	All	Since 2021
Citations	18045	17851
h-index	57	55
i10-index	115	115

Citations per year
Year	Citations
2020	136
2021	296
2022	654
2023	2250
2024	5984
2025	8354
2026	217

Public access (based on funding mandates)

54 articles available
2 articles not available

Co-authors

Hongsheng Li (李鸿升). The Chinese University of Hong Kong. Verified email at ee.cuhk.edu.hk
Renrui Zhang. Seed & MMLab & PKU. Verified email at pku.edu.cn
Yu Qiao. Professor of Shanghai AI Laboratory; Shenzhen Institutes of Advanced Technology, CAS. Verified email at siat.ac.cn
Dongyang Liu. MMLab CUHK. Verified email at link.cuhk.edu.hk
Jiaming Han. PhD Student, CUHK MMLab. Verified email at link.cuhk.edu.hk
Zhen Li. Alibaba Group. Verified email at mail.nankai.edu.cn
Jiasen Lu. Research Scientist, Apple. Verified email at apple.com

Peng Gao

Z-Image Team, Alibaba Group

Verified email at alibaba-inc.com

Image/Video Generation, LLMs, VLMs


Title	Cited by	Year
Clip-adapter: Better vision-language models with feature adapters. P Gao, S Geng, R Zhang, T Ma, R Fang, Y Zhang, H Li, Y Qiao. International Journal of Computer Vision 132 (2), 581-595, 2024	1709	2024
Llama-adapter: Efficient fine-tuning of language models with zero-init attention. R Zhang, J Han, C Liu, P Gao, A Zhou, X Hu, S Yan, P Lu, H Li, Y Qiao. arXiv preprint arXiv:2303.16199, 2023	1286*	2023
Tip-adapter: Training-free clip-adapter for better vision-language modeling. R Zhang, R Fang, W Zhang, P Gao, K Li, J Dai, Y Qiao, H Li. arXiv preprint arXiv:2111.03930, 2021	1242*	2021
Uniformer: Unified transformer for efficient spatiotemporal representation learning. K Li, Y Wang, P Gao, G Song, Y Liu, H Li, Y Qiao. arXiv preprint arXiv:2201.04676, 2022	1116*	2022
Llama-adapter v2: Parameter-efficient visual instruction model. P Gao, J Han, R Zhang, Z Lin, S Geng, A Zhou, W Zhang, P Lu, C He, ... arXiv preprint arXiv:2304.15010, 2023	775	2023
Pointclip: Point cloud understanding by clip. R Zhang, Z Guo, W Zhang, K Li, X Miao, B Cui, Y Qiao, P Gao, H Li. Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2022	695	2022
Dynamic fusion with intra-and inter-modality attention flow for visual question answering. P Gao, Z Jiang, H You, P Lu, SCH Hoi, X Wang, H Li. Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2019	525*	2019
Mathverse: Does your multi-modal llm truly see the diagrams in visual math problems? R Zhang, D Jiang, Y Zhang, H Lin, Z Guo, P Qiu, A Zhou, P Lu, KW Chang, ... European Conference on Computer Vision, 169-186, 2024	479	2024
Omniquant: Omnidirectionally calibrated quantization for large language models. W Shao, M Chen, Z Zhang, P Xu, L Zhao, Z Li, K Zhang, P Gao, Y Qiao, ... arXiv preprint arXiv:2308.13137, 2023	461	2023
Fast convergence of detr with spatially modulated co-attention. P Gao, M Zheng, X Wang, J Dai, H Li. Proceedings of the IEEE/CVF international conference on computer vision …, 2021	442	2021
Point-m2ae: multi-scale masked autoencoders for hierarchical point cloud pre-training. R Zhang, Z Guo, P Gao, R Fang, B Zhao, D Wang, Y Qiao, H Li. Advances in neural information processing systems 35, 27061-27074, 2022	411	2022
Personalize segment anything model with one shot. R Zhang, Z Jiang, Z Guo, S Yan, J Pan, X Ma, H Dong, P Gao, H Li. arXiv preprint arXiv:2305.03048, 2023	343	2023
Sphinx: The joint mixing of weights, tasks, and visual embeddings for multi-modal large language models. Z Lin, C Liu, R Zhang, P Gao, L Qiu, H Xiao, H Qiu, C Lin, W Shao, ... arXiv preprint arXiv:2311.07575, 2023	339	2023
Frozen clip models are efficient video learners. Z Lin, S Geng, R Zhang, P Gao, G De Melo, X Wang, J Dai, Y Qiao, H Li. European Conference on Computer Vision, 388-404, 2022	326	2022
Lvlm-ehub: A comprehensive evaluation benchmark for large vision-language models. P Xu, W Shao, K Zhang, P Gao, S Liu, M Lei, F Meng, S Huang, Y Qiao, ... IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024	324	2024
You only need 90k parameters to adapt light: a light weight transformer for image enhancement and exposure correction. Z Cui, K Li, L Gu, S Su, P Gao, Z Jiang, Y Qiao, T Harada. arXiv preprint arXiv:2205.14871, 2022	303	2022
Prompt, generate, then cache: Cascade of foundation models makes strong few-shot learners. R Zhang, X Hu, B Li, S Huang, H Deng, Y Qiao, P Gao, H Li. Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2023	280	2023
End-to-end object detection with adaptive clustering transformer. M Zheng, P Gao, R Zhang, K Li, X Wang, H Li, H Dong. arXiv preprint arXiv:2011.09315, 2020	276	2020
Pointclip v2: Prompting clip and gpt for powerful 3d open-world learning. X Zhu, R Zhang, B He, Z Guo, Z Zeng, Z Qin, S Zhang, P Gao. Proceedings of the IEEE/CVF international conference on computer vision …, 2023	267	2023
MonoDETR: Depth-guided transformer for monocular 3D object detection. R Zhang, H Qiu, T Wang, Z Guo, Z Cui, Y Qiao, H Li, P Gao. Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023	266	2023
