| DriveVLM: The convergence of autonomous driving and large vision-language models X Tian, J Gu, B Li, Y Liu, Y Wang, Z Zhao, K Zhan, P Jia, X Lang, H Zhao arXiv preprint arXiv:2402.12289, 2024 | 408 | 2024 |
| ReconDreamer: Crafting world models for driving scene reconstruction via online restoration C Ni, G Zhao, X Wang, Z Zhu, W Qin, G Huang, C Liu, Y Chen, Y Wang, ... Proceedings of the Computer Vision and Pattern Recognition Conference, 1559-1569, 2025 | 56 | 2025 |
| Unleashing generalization of end-to-end autonomous driving with controllable long video generation E Ma, L Zhou, T Tang, Z Zhang, D Han, J Jiang, K Zhan, P Jia, X Lang, ... arXiv preprint arXiv:2406.01349, 2024 | 44 | 2024 |
| TOD3Cap: Towards 3D dense captioning in outdoor scenes B Jin, Y Zheng, P Li, W Li, Y Zheng, S Hu, X Liu, J Zhu, Z Yan, H Sun, ... European Conference on Computer Vision, 367-384, 2024 | 30 | 2024 |
| World4Drive: End-to-end autonomous driving via intention-aware physical latent world model Y Zheng, P Yang, Z Xing, Q Zhang, Y Zheng, Y Gao, P Li, T Zhang, Z Xia, ... Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2025 | 24 | 2025 |
| GaussianAD: Gaussian-centric end-to-end autonomous driving W Zheng, J Wu, Y Zheng, S Zuo, Z Xie, L Yang, Y Pan, Z Hao, P Jia, ... arXiv preprint arXiv:2412.10371, 2024 | 24 | 2024 |
| DiVE: DiT-based video generation with enhanced control J Jiang, G Hong, L Zhou, E Ma, H Hu, X Zhou, J Xiang, F Liu, K Yu, H Sun, ... arXiv preprint arXiv:2409.01595, 2024 | 24 | 2024 |
| BEV-CLIP: Multi-modal BEV retrieval methodology for complex scene in autonomous driving Z Jia, T Gao, C Cai, C Hou, P Jia, F JingChen, Y Zhao, K Zhan, Fu Liu, ... | 18 | 2024 |
| Finetuning generative trajectory model with reinforcement learning from human feedback D Li, J Ren, Y Wang, X Wen, P Li, L Xu, K Zhan, Z Xia, P Jia, X Lang, N Xu, ... arXiv preprint arXiv:2503.10434, 2025 | 17 | 2025 |
| Generalizing motion planners with mixture of experts for autonomous driving Q Sun, H Wang, J Zhan, F Nie, X Wen, L Xu, K Zhan, P Jia, X Lang, ... 2025 IEEE International Conference on Robotics and Automation (ICRA), 6033-6039, 2025 | 16 | 2025 |
| Preliminary investigation into data scaling laws for imitation learning-based end-to-end autonomous driving Y Zheng, Z Xia, Q Zhang, T Zhang, B Lu, X Huo, C Han, Y Li, M Yu, B Jin, ... arXiv preprint arXiv:2412.02689, 2024 | 15 | 2024 |
| TransDiffuser: End-to-end Trajectory Generation with Decorrelated Multi-modal Representation for Autonomous Driving X Jiang, Y Ma, P Li, L Xu, X Wen, K Zhan, Z Xia, P Jia, XP Lang, S Sun arXiv preprint arXiv:2505.09315, 2025 | 14 | 2025 |
| BEV-TSR: Text-scene retrieval in BEV space for autonomous driving T Tang, D Wei, Z Jia, T Gao, C Cai, C Hou, P Jia, K Zhan, H Sun, ... Proceedings of the AAAI Conference on Artificial Intelligence 39 (7), 7275-7283, 2025 | 14 | 2025 |
| DriveAction: A benchmark for exploring human-like driving decisions in VLA models Y Hao, Z Li, L Sun, W Wang, N Yi, S Song, C Qin, M Zhou, Y Zhan, X Lang arXiv preprint arXiv:2506.05667, 2025 | 10 | 2025 |
| GeoDrive: 3D geometry-informed driving world model with precise action control A Chen, W Zheng, Y Wang, X Zhang, K Zhan, P Jia, K Keutzer, S Zhang arXiv preprint arXiv:2505.22421, 2025 | 9 | 2025 |
| The better you learn, the smarter you prune: Towards efficient vision-language-action models via differentiable token pruning T Jiang, X Jiang, Y Ma, X Wen, B Li, K Zhan, P Jia, Y Liu, S Sun, X Lang arXiv preprint arXiv:2509.12594, 2025 | 8 | 2025 |
| UA-Track: Uncertainty-aware end-to-end 3D multi-object tracking L Zhou, T Tang, P Hao, Z He, K Ho, S Gu, W Hou, Z Hao, H Sun, K Zhan, ... arXiv preprint arXiv:2406.02147, 2024 | 8 | 2024 |
| TOD3Cap: Towards 3D dense captioning in outdoor scenes B Jin, Y Zheng, P Li, W Li, Y Zheng, S Hu, X Liu, J Zhu, Z Yan, H Sun, ..., Xiaoxiao Long, Yilun Chen, and Hao Zhao Computer Vision–ECCV, 367-384, 2024 | 6 | 2024 |
| Other vehicle trajectories are also needed: A driving world model unifies ego-other vehicle trajectories in video latent space J Zhu, Z Jia, T Gao, J Deng, S Li, L Zhang, F Liu, P Jia, X Lang arXiv preprint arXiv:2503.09215, 2025 | 5 | 2025 |
| Unleashing generalization of end-to-end autonomous driving with controllable … E Ma, L Zhou, T Tang, Z Zhang, D Han, J Jiang, K Zhan, P Jia, X Lang, ..., Kaicheng Yu 2024 | 5 | 2024 |