
Xiaohan Wang
Title
Cited by
Year
Humanity's last exam
L Phan, A Gatti, Z Han, N Li, J Hu, H Zhang, CBC Zhang, M Shaaban, ...
arXiv preprint arXiv:2501.14249, 2025
Cited by: 301; Year: 2025
T2VLAD: Global-Local Sequence Alignment for Text-Video Retrieval
X Wang, L Zhu, Y Yang
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021
Cited by: 262; Year: 2021
VideoAgent: Long-form Video Understanding with Large Language Model as Agent
X Wang*, Y Zhang*, O Zohar, S Yeung-Levy
ECCV 2024, 2024
Cited by: 230; Year: 2024
CenterCLIP: Token Clustering for Efficient Text-Video Retrieval
S Zhao, L Zhu, X Wang, Y Yang
SIGIR 2022, 2022
Cited by: 178; Year: 2022
Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models
W Wu, X Wang, H Luo, J Wang, Y Yang, W Ouyang
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023
Cited by: 141; Year: 2023
Large-Scale Video Panoptic Segmentation in the Wild: A Benchmark
J Miao, X Wang, Y Wu, W Li, X Zhang, Y Wei, Y Yang
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022
Cited by: 133; Year: 2022
Symbiotic attention for egocentric action recognition with object-centric alignment
X Wang, L Zhu, Y Wu, Y Yang
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2020
Cited by: 117; Year: 2020
Bird's-Eye-View Scene Graph for Vision-Language Navigation
R Liu, X Wang, W Wang, Y Yang
ICCV 2023, 2023
Cited by: 104; Year: 2023
Learning to anticipate egocentric actions by imagination
Y Wu, L Zhu, X Wang, Y Yang, F Wu
IEEE Transactions on Image Processing 30, 1143-1152, 2020
Cited by: 102; Year: 2020
Symbiotic attention with privileged information for egocentric action recognition
X Wang, Y Wu, L Zhu, Y Yang
Proceedings of the AAAI Conference on Artificial Intelligence 34 (07), 12249 …, 2020
Cited by: 98; Year: 2020
Parameter-efficient person re-identification in the 3D space
Z Zheng, X Wang, N Zheng, Y Yang
IEEE Transactions on Neural Networks and Learning Systems 35 (6), 7534-7547, 2022
Cited by: 97; Year: 2022
Why are Visually-Grounded Language Models Bad at Image Classification?
Y Zhang, A Unell, X Wang, D Ghosh, Y Su, L Schmidt, S Yeung-Levy
NeurIPS 2024, 2024
Cited by: 91; Year: 2024
Interactive Prototype Learning for Egocentric Action Recognition
X Wang, L Zhu, H Wang, Y Yang
ICCV 2021, 2021
Cited by: 90; Year: 2021
Global-to-Local Modeling for Video-based 3D Human Pose and Shape Estimation
X Shen, Z Yang, X Wang, J Ma, C Zhou, Y Yang
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023
Cited by: 82; Year: 2023
Lana: A Language-Capable Navigator for Instruction Following and Generation
X Wang, W Wang, J Shao, Y Yang
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023
Cited by: 70; Year: 2023
Scalable video object segmentation with identification mechanism
Z Yang, J Miao, Y Wei, W Wang, X Wang, Y Yang
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023
Cited by: 69*; Year: 2023
Gloss-Free End-to-End Sign Language Translation
K Lin, X Wang, L Zhu, K Sun, B Zhang, Y Yang
ACL 2023 (Oral), 2023
Cited by: 66; Year: 2023
Apollo: An exploration of video understanding in large multimodal models
O Zohar, X Wang, Y Dubois, N Mehta, T Xiao, P Hansen-Estruch, L Yu, ...
CVPR 2025, https://arxiv.org/pdf/2412.10360, 2025
Cited by: 64; Year: 2025
Describing Differences in Image Sets with Natural Language
L Dunlap*, Y Zhang*, X Wang, R Zhong, T Darrell, J Steinhardt, ...
CVPR 2024 (Oral), 2024
Cited by: 60; Year: 2024
Action Sensitivity Learning for Temporal Action Localization
J Shao, X Wang, R Quan, J Zheng, J Yang, Y Yang
ICCV 2023, 2023
Cited by: 60; Year: 2023