Ziniu Li

Cited by

	All	Since 2021
Citations	1072	1067
h-index	17	17
i10-index	23	23

Citations per year: 2020 (3), 2021 (14), 2022 (39), 2023 (97), 2024 (238), 2025 (647), 2026 (31)

Public access

9 articles available, 2 articles not available (based on funding mandates)

Co-authors

Tian Xu - Nanjing University - Verified email at lamda.nju.edu.cn
Yang Yu - Professor, Nanjing University - Verified email at nju.edu.cn
Ruoyu Sun - Chinese University of Hong Kong (Shenzhen), Shenzhen Institute of Big Data - Verified email at cuhk.edu.cn
Zhi-Quan Luo - Professor, The Chinese University of Hong Kong, Shenzhen, China - Verified email at cuhk.edu.cn
Tian Ding - Shenzhen Research Institute of Big Data - Verified email at sribd.cn
Yushun Zhang - The Chinese University of Hong Kong, Shenzhen, China - Verified email at link.cuhk.edu.cn
Congliang Chen - Research Assistant Professor, Shenzhen Loop Area Institute - Verified email at slai.edu.cn
Zeyu Qin - Hong Kong University of Science and Technology - Verified email at connect.ust.hk
Jiancong Xiao - University of Pennsylvania - Verified email at upenn.edu
Zhengyang Tang - The Chinese University of Hong Kong, Shenzhen - Verified email at link.cuhk.edu.cn
Ge Zhang - M-A-P, Bytedance Seed, University of Waterloo - Verified email at bytedance.com
Diederik P. Kingma - Anthropic - Verified email at anthropic.com
Tong Zhang - UIUC - Verified email at tongzhang-ml.org
Weijie Su - Associate Professor, University of Pennsylvania - Verified email at wharton.upenn.edu
Xueyao Zhang - The Chinese University of Hong Kong, Shenzhen - Verified email at link.cuhk.edu.cn
Tianyun Yang - Shenzhen Research Institute of Big Data
Yoshua Bengio - Professor of computer science, University of Montreal, Mila, IVADO, CIFAR - Verified email at umontreal.ca

Ziniu Li

Other names: Zi-Niu Li

The Chinese University of Hong Kong, Shenzhen

Verified email at link.cuhk.edu.cn - Homepage

Machine Learning, Reinforcement Learning, Large Language Models


Title	Cited by	Year
ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models. Z Li, T Xu, Y Zhang, Z Lin, Y Yu, R Sun, ZQ Luo. International Conference on Machine Learning (ICML), 2024	187*	2024
Error bounds of imitating policies and environments. T Xu, Z Li, Y Yu. Advances in Neural Information Processing Systems (NeurIPS) 33, 15737-15749, 2020	140*	2020
Adam-mini: Use fewer learning rates to gain more. Y Zhang, C Chen, Z Li, T Ding, C Wu, DP Kingma, Y Ye, ZQ Luo, R Sun. International Conference on Learning Representations (ICLR), 2025	101*	2025
Why transformers need Adam: A Hessian perspective. Y Zhang, C Chen, T Ding, Z Li, R Sun, ZQ Luo. Neural Information Processing Systems (NeurIPS), 2024	98	2024
Preserving diversity in supervised fine-tuning of large language models. Z Li, C Chen, T Xu, Z Qin, J Xiao, ZQ Luo, R Sun. International Conference on Learning Representations (ICLR), 2025	61*	2025
On the algorithmic bias of aligning large language models with RLHF: Preference collapse and matching regularization. J Xiao, Z Li, X Xie, E Getzen, C Fang, Q Long, WJ Su. Journal of the American Statistical Association, 1-21, 2025	57	2025
Error bounds of imitating policies and environments for reinforcement learning. T Xu, Z Li, Y Yu. IEEE Transactions on Pattern Analysis and Machine Intelligence 44 (10), 6968 …, 2021	53	2021
When is RL better than DPO in RLHF? A Representation and Optimization Perspective. Z Li, T Xu, Y Yu. Tiny Paper of International Conference on Learning Representations (ICLR), 2024	42*	2024
TreePO: Bridging the gap of policy optimization and efficacy and inference efficiency with heuristic tree-based modeling. Y Li, Q Gu, Z Wen, Z Li, T Xing, S Guo, T Zheng, X Zhou, X Qu, W Zhou, ... arXiv preprint arXiv:2508.17445, 2025	26*	2025
Understanding and Mitigating Hallucination in Large Vision-Language Models via Modular Attribution and Intervention. T Yang, Z Li, J Cao, C Xu. International Conference on Learning Representations (ICLR), 2025	26*	2025
HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning. Z Li, Y Li, Y Zhang, T Zhang, ZQ Luo. International Conference on Learning Representations (ICLR), 2022	26	2022
Imitation learning from imperfection: Theoretical justifications and algorithms. Z Li, T Xu, Z Qin, Y Yu, ZQ Luo. Neural Information Processing Systems (NeurIPS), 2023	25	2023
Self-Guided Evolution Strategies with Historical Estimated Gradients. FY Liu, ZN Li, C Qian. International Joint Conferences on Artificial Intelligence (IJCAI), 2020	25	2020
A survey on large language models for mathematical reasoning. PY Wang, TS Liu, C Wang, Z Li, Y Wang, S Yan, C Jia, XH Liu, X Chen, ... ACM Computing Surveys, 2025	22	2025
Understanding adversarial imitation learning in small sample regime: A stage-coupled analysis. T Xu, Z Li, Y Yu, ZQ Luo. arXiv preprint arXiv:2208.01899, 2022	21*	2022
Provably Efficient Adversarial Imitation Learning with Unknown Transitions. T Xu, Z Li, Y Yu, ZQ Luo. Conference on Uncertainty in Artificial Intelligence (UAI), 2023	19	2023
Rethinking ValueDice - Does It Really Improve Performance? Z Li, T Xu, Y Yu, ZQ Luo. Blog of International Conference on Learning Representations (ICLR), 2022	19	2022
Seed-oss open-source models. BDS Team	15	2025
Self-Evolving Critique Abilities in Large Language Models. Z Tang, Z Li, Z Xiao*, T Ding, R Sun, B Wang, D Liu, F Huang, T Liu, ... Second Conference on Language Modeling, 2025	14*	2025
Advancing zero-shot text-to-speech intelligibility across diverse domains via preference alignment. X Zhang, Y Wang, C Wang, Z Li, Z Chen, Z Wu. Proceedings of the 63rd Annual Meeting of the Association for Computational …, 2025	12	2025

