Xiaohan Wang

Cited by

	All	Since 2021
Citations	3121	3097
h-index	30	30
i10-index	43	43

Citations per year: 2020: 17, 2021: 89, 2022: 192, 2023: 422, 2024: 762, 2025: 1585, 2026: 41

Public access

17 articles available, 2 articles not available (based on funding mandates)

Xiaohan Wang

Stanford University

Verified email at stanford.edu

Computer Vision, Video Understanding, Large Multimodal Models


Title	Cited by	Year
Humanity's last exam. L Phan, A Gatti, Z Han, N Li, J Hu, H Zhang, CBC Zhang, M Shaaban, ... arXiv preprint arXiv:2501.14249, 2025	301	2025
T2VLAD: Global-Local Sequence Alignment for Text-Video Retrieval. X Wang, L Zhu, Y Yang. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021	262	2021
VideoAgent: Long-form Video Understanding with Large Language Model as Agent. X Wang, Y Zhang, O Zohar, S Yeung-Levy. ECCV 2024, 2024	230	2024
CenterCLIP: Token Clustering for Efficient Text-Video Retrieval. S Zhao, L Zhu, X Wang, Y Yang. SIGIR 2022, 2022	178	2022
Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models. W Wu, X Wang, H Luo, J Wang, Y Yang, W Ouyang. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023	141	2023
Large-Scale Video Panoptic Segmentation in the Wild: A Benchmark. J Miao, X Wang, Y Wu, W Li, X Zhang, Y Wei, Y Yang. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022	133	2022
Symbiotic attention for egocentric action recognition with object-centric alignment. X Wang, L Zhu, Y Wu, Y Yang. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2020	117	2020
Bird's-Eye-View Scene Graph for Vision-Language Navigation. R Liu, X Wang, W Wang, Y Yang. ICCV 2023, 2023	104	2023
Learning to anticipate egocentric actions by imagination. Y Wu, L Zhu, X Wang, Y Yang, F Wu. IEEE Transactions on Image Processing 30, 1143-1152, 2020	102	2020
Symbiotic attention with privileged information for egocentric action recognition. X Wang, Y Wu, L Zhu, Y Yang. Proceedings of the AAAI Conference on Artificial Intelligence 34 (07), 12249 …, 2020	98	2020
Parameter-efficient person re-identification in the 3D space. Z Zheng, X Wang, N Zheng, Y Yang. IEEE Transactions on Neural Networks and Learning Systems 35 (6), 7534-7547, 2022	97	2022
Why are Visually-Grounded Language Models Bad at Image Classification? Y Zhang, A Unell, X Wang, D Ghosh, Y Su, L Schmidt, S Yeung-Levy. NeurIPS 2024, 2024	91	2024
Interactive Prototype Learning for Egocentric Action Recognition. X Wang, L Zhu, H Wang, Y Yang. ICCV 2021, 2021	90	2021
Global-to-Local Modeling for Video-based 3D Human Pose and Shape Estimation. X Shen, Z Yang, X Wang, J Ma, C Zhou, Y Yang. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023	82	2023
Lana: A Language-Capable Navigator for Instruction Following and Generation. X Wang, W Wang, J Shao, Y Yang. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023	70	2023
Scalable video object segmentation with identification mechanism. Z Yang, J Miao, Y Wei, W Wang, X Wang, Y Yang. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023	69*	2023
Gloss-Free End-to-End Sign Language Translation. K Lin, X Wang, L Zhu, K Sun, B Zhang, Y Yang. ACL 2023 (Oral), 2023	66	2023
Apollo: An exploration of video understanding in large multimodal models. O Zohar, X Wang, Y Dubois, N Mehta, T Xiao, P Hansen-Estruch, L Yu, ... CVPR 2025, https://arxiv.org/pdf/2412.10360, 2025	64	2025
Describing Differences in Image Sets with Natural Language. L Dunlap, Y Zhang, X Wang, R Zhong, T Darrell, J Steinhardt, ... CVPR 2024 (Oral), 2024	60	2024
Action Sensitivity Learning for Temporal Action Localization. J Shao, X Wang, R Quan, J Zheng, J Yang, Y Yang. ICCV 2023, 2023	60	2023
