| The Llama 3 herd of models A Dubey, A Jauhri, A Pandey, A Kadian, A Al-Dahle, A Letman, A Mathur, ... arXiv e-prints, arXiv:2407.21783, 2024 | 12293* | 2024 |
| Deep supervised hashing with triplet labels X Wang, Y Shi, KM Kitani ACCV, 2016 | 280 | 2016 |
| Emu: Enhancing image generation models using photogenic needles in a haystack X Dai, J Hou, CY Ma, S Tsai, J Wang, R Wang, P Zhang, S Vandenhende, ... arXiv preprint arXiv:2309.15807, 2023 | 275 | 2023 |
| The Llama 4 herd: The beginning of a new era of natively multimodal AI innovation AI Meta https://ai.meta.com/blog/llama-4-multimodal-intelligence/, checked on 4 (7 …, 2025 | 249 | 2025 |
| Wisdom of Committees: An Overlooked Approach To Faster and More Accurate Models X Wang, D Kondratyuk, E Christiansen, KM Kitani, Y Alon, E Eban ICLR, 2022 | 93* | 2022 |
| Apollo: An exploration of video understanding in large multimodal models O Zohar, X Wang, Y Dubois, N Mehta, T Xiao, P Hansen-Estruch, L Yu, ... Proceedings of the Computer Vision and Pattern Recognition Conference, 18891 …, 2025 | 64 | 2025 |
| ControlRoom3D: Room generation using semantic proxy rooms J Schult, S Tsai, L Höllein, B Wu, J Wang, CY Ma, K Li, X Wang, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 | 58 | 2024 |
| AttentionNAS: Spatiotemporal Attention Cell Search for Video Classification X Wang, X Xiong, M Neumann, AJ Piergiovanni, MS Ryoo, A Angelova, ... ECCV, 2020 | 56 | 2020 |
| Learnable embedding space for efficient neural architecture compression S Cao, X Wang, KM Kitani ICLR, 2019 | 53 | 2019 |
| Hamming Compatible Quantization for Hashing Z Wang, LY Duan, J Lin, X Wang, T Huang, W Gao IJCAI, 2015 | 29 | 2015 |
| Accelerating Multimodal Large Language Models by Searching Optimal Vision Token Reduction S Zhao, Z Wang, F Juefei-Xu, X Xia, M Liu, X Wang, M Liang, N Zhang, ... Proceedings of the Computer Vision and Pattern Recognition Conference, 29869 …, 2025 | 16 | 2025 |
| Contextual visual similarity X Wang, KM Kitani, M Hebert arXiv preprint arXiv:1612.02534, 2016 | 12 | 2016 |
| Error Correction Maximization for Deep Image Hashing X Xu, X Wang, KM Kitani BMVC, 2018 | 8 | 2018 |
| Neighborhood-Aware Neural Architecture Search X Wang, S Cao, M Li, KM Kitani BMVC, 2021 | 7 | 2021 |
| Building a Mind Palace: Structuring Environment-Grounded Semantic Graphs for Effective Long Video Analysis with LLMs Z Huang, Y Ji, X Wang, N Mehta, T Xiao, D Lee, S Vanvalkenburgh, S Zha, ... Proceedings of the Computer Vision and Pattern Recognition Conference, 24169 …, 2025 | 6 | 2025 |
| Cost-Aware Evaluation and Model Scaling for LiDAR-Based 3D Object Detection X Wang, KM Kitani arXiv preprint arXiv:2205.01142, 2022 | 5 | 2022 |
| Efficient model performance estimation via feature histories S Cao, X Wang, K Kitani arXiv preprint arXiv:2103.04450, 2021 | 2 | 2021 |