Peng Jia

Cited by

	All	Since 2021
Citations	780	780
h-index	13	13
i10-index	14	14

Citations per year: 2024: 99; 2025: 655; 2026: 23

Co-authors

Kun Zhan: AI Researcher, LiAuto. Verified email at lixiang.com
Hang Zhao: Assistant Professor, Tsinghua University. Verified email at csail.mit.edu
Kaicheng Yu: Assistant Professor, Westlake University, PI of Autonomous Intelligence Lab. Verified email at westlake.edu.cn
Xiaodan Liang: Professor of Computer Science, Sun Yat-sen University, MBZUAI, CMU, NUS. Verified email at mail2.sysu.edu.cn
Shanghang Zhang: Peking University. Verified email at pku.edu.cn
Dongbin Zhao: Institute of Automation, Chinese Academy of Sciences. Verified email at ia.ac.cn
Hao Zhao: Tsinghua University. Verified email at air.tsinghua.edu.cn
Jiaming Liu: PhD of Peking University. Verified email at bupt.edu.cn
Yue Wang: USC. Verified email at csail.mit.edu

Peng Jia

CEO, Simplexity Robotics

Verified email at s-robots.com

Embodied AI


Title	Cited by	Year
Drivevlm: The convergence of autonomous driving and large vision-language models X Tian, J Gu, B Li, Y Liu, Y Wang, Z Zhao, K Zhan, P Jia, X Lang, H Zhao arXiv preprint arXiv:2402.12289, 2024	408	2024
Recondreamer: Crafting world models for driving scene reconstruction via online restoration C Ni, G Zhao, X Wang, Z Zhu, W Qin, G Huang, C Liu, Y Chen, Y Wang, ... Proceedings of the Computer Vision and Pattern Recognition Conference, 1559-1569, 2025	56	2025
Unleashing generalization of end-to-end autonomous driving with controllable long video generation E Ma, L Zhou, T Tang, Z Zhang, D Han, J Jiang, K Zhan, P Jia, X Lang, ... arXiv preprint arXiv:2406.01349, 2024	44	2024
Tod3cap: Towards 3d dense captioning in outdoor scenes B Jin, Y Zheng, P Li, W Li, Y Zheng, S Hu, X Liu, J Zhu, Z Yan, H Sun, ... European Conference on Computer Vision, 367-384, 2024	30	2024
World4drive: End-to-end autonomous driving via intention-aware physical latent world model Y Zheng, P Yang, Z Xing, Q Zhang, Y Zheng, Y Gao, P Li, T Zhang, Z Xia, ... Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2025	24	2025
Gaussianad: Gaussian-centric end-to-end autonomous driving W Zheng, J Wu, Y Zheng, S Zuo, Z Xie, L Yang, Y Pan, Z Hao, P Jia, ... arXiv preprint arXiv:2412.10371, 2024	24	2024
Dive: Dit-based video generation with enhanced control J Jiang, G Hong, L Zhou, E Ma, H Hu, X Zhou, J Xiang, F Liu, K Yu, H Sun, ... arXiv preprint arXiv:2409.01595, 2024	24	2024
BEV-CLIP: Multi-modal BEV retrieval methodology for complex scene in autonomous driving Z Jia, T Gao, C Cai, C Hou, P Jia, F JingChen, Y ZHAO, K Zhan, FU LIU, ...	18	2024
Finetuning generative trajectory model with reinforcement learning from human feedback D Li, J Ren, Y Wang, X Wen, P Li, L Xu, K Zhan, Z Xia, P Jia, X Lang, N Xu, ... arXiv preprint arXiv:2503.10434, 2025	17	2025
Generalizing motion planners with mixture of experts for autonomous driving Q Sun, H Wang, J Zhan, F Nie, X Wen, L Xu, K Zhan, P Jia, X Lang, ... 2025 IEEE International Conference on Robotics and Automation (ICRA), 6033-6039, 2025	16	2025
Preliminary investigation into data scaling laws for imitation learning-based end-to-end autonomous driving Y Zheng, Z Xia, Q Zhang, T Zhang, B Lu, X Huo, C Han, Y Li, M Yu, B Jin, ... arXiv preprint arXiv:2412.02689, 2024	15	2024
TransDiffuser: End-to-end Trajectory Generation with Decorrelated Multi-modal Representation for Autonomous Driving X Jiang, Y Ma, P Li, L Xu, X Wen, K Zhan, Z Xia, P Jia, XP Lang, S Sun arXiv preprint arXiv:2505.09315, 2025	14	2025
Bev-tsr: Text-scene retrieval in bev space for autonomous driving T Tang, D Wei, Z Jia, T Gao, C Cai, C Hou, P Jia, K Zhan, H Sun, ... Proceedings of the AAAI Conference on Artificial Intelligence 39 (7), 7275-7283, 2025	14	2025
Driveaction: A benchmark for exploring human-like driving decisions in vla models Y Hao, Z Li, L Sun, W Wang, N Yi, S Song, C Qin, M Zhou, Y Zhan, X Lang arXiv preprint arXiv:2506.05667, 2025	10	2025
Geodrive: 3d geometry-informed driving world model with precise action control A Chen, W Zheng, Y Wang, X Zhang, K Zhan, P Jia, K Keutzer, S Zhang arXiv preprint arXiv:2505.22421, 2025	9	2025
The better you learn, the smarter you prune: Towards efficient vision-language-action models via differentiable token pruning T Jiang, X Jiang, Y Ma, X Wen, B Li, K Zhan, P Jia, Y Liu, S Sun, X Lang arXiv preprint arXiv:2509.12594, 2025	8	2025
Ua-track: Uncertainty-aware end-to-end 3d multi-object tracking L Zhou, T Tang, P Hao, Z He, K Ho, S Gu, W Hou, Z Hao, H Sun, K Zhan, ... arXiv e-prints, arXiv: 2406.02147, 2024	8	2024
Xiaoxiao Long, Yilun Chen, and Hao Zhao. Tod3cap: Towards 3d dense captioning in outdoor scenes B Jin, Y Zheng, P Li, W Li, Y Zheng, S Hu, X Liu, J Zhu, Z Yan, H Sun, ... Computer Vision–ECCV, 367-384, 2024	6	2024
Other vehicle trajectories are also needed: A driving world model unifies ego-other vehicle trajectories in video latent space J Zhu, Z Jia, T Gao, J Deng, S Li, L Zhang, F Liu, P Jia, X Lang arXiv preprint arXiv:2503.09215, 2025	5	2025
Kaicheng Yu E Ma, L Zhou, T Tang, Z Zhang, D Han, J Jiang, K Zhan, P Jia, X Lang, ... Unleashing generalization of end-to-end autonomous driving with controllable …, 2024	5	2024


