Xiao Hu

Cited by

	All	Since 2021
Citations	373	362
h-index	10	10
i10-index	10	10

Citations per year
Year	Citations
2020	9
2021	23
2022	10
2023	27
2024	72
2025	219
2026	11

Public access

1 article

5 articles

available

not available

Based on funding mandates

Co-authors

Xianyuan Zhan	Associate Professor, Institute for AI Industry Research (AIR), Tsinghua University	Verified email at air.tsinghua.edu.cn
Jianxiong Li	Tsinghua University	Verified email at mails.tsinghua.edu.cn
YA-QIN ZHANG	Tsinghua Univ, Microsoft, Baidu, Sarnoff	Verified email at air.tsinghua.edu.cn
Yi-Fan Zhang	Institute of Automation, Chinese Academy of Sciences	Verified email at ia.ac.cn
Guorui Zhou		Verified email at kuaishou.com
Ni Mu	Tsinghua University	Verified email at mails.tsinghua.edu.cn
(Samuel) Qing-Shan Jia	Tsinghua University	Verified email at tsinghua.edu.cn
Haoyi Niu	UC Berkeley	Verified email at berkeley.edu
Yinan Zheng	Tsinghua University	Verified email at mails.tsinghua.edu.cn
Jizhi Zhang	PhD candidate @ USTC	Verified email at mail.ustc.edu.cn

Xiao Hu

Tsinghua University

Verified email at mails.tsinghua.edu.cn

LLM, Reinforcement Learning


Title	Cited by	Year
Fault diagnosis using novel AdaBoost based discriminant locality preserving projection with resamples; YL He, Y Zhao, X Hu, XN Yan, QX Zhu, Y Xu; Engineering Applications of Artificial Intelligence 91, 103631, 2020	71	2020
Thyme: Think Beyond Images; YF Zhang, X Lu, S Yin, C Fu, W Chen, X Hu, B Wen, K Jiang, C Liu, ...; arXiv preprint arXiv:2508.11630, 2025	61*	2025
PROTO: Iterative Policy Regularized Offline-to-Online Reinforcement Learning; J Li, X Hu, H Xu, J Liu, X Zhan, YQ Zhang; arXiv preprint arXiv:2305.15669, 2023	39	2023
Query-Policy Misalignment in Preference-Based Reinforcement Learning; X Hu, J Li, X Zhan, QS Jia, YQ Zhang; International Conference on Learning Representations (ICLR), 2024, Spotlight, 2023	36	2023
Kwai Keye-VL Technical Report; KK Team, B Yang, B Wen, C Liu, C Chu, C Song, C Rao, C Yi, D Li, ...; arXiv preprint arXiv:2507.01949, 2025	29	2025
Mind the gap: Offline policy optimization for imperfect rewards; J Li, X Hu, H Xu, J Liu, X Zhan, QS Jia, YQ Zhang; International Conference on Learning Representations (ICLR), 2023, 2023	28	2023
R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning; YF Zhang, X Lu, X Hu, C Fu, B Wen, T Zhang, C Liu, K Jiang, K Chen, ...; arXiv preprint arXiv:2505.02835, 2025	25	2025
Open RL Benchmark: Comprehensive Tracked Experiments for Reinforcement Learning; S Huang, Q Gallouédec, F Felten, A Raffin, RFJ Dossa, Y Zhao, ...; arXiv preprint arXiv:2402.03046, 2024	19	2024
DecisionNCE: Embodied Multimodal Representations via Implicit Preference Learning; J Li, J Zheng, Y Zheng, L Mao, X Hu, S Cheng, H Niu, J Liu, Y Liu, J Liu, ...; ICML 2024, 2024	15	2024
Kwai Keye-VL 1.5 Technical Report; B Yang, B Wen, B Ding, C Liu, C Chu, C Song, C Rao, C Yi, D Li, D Zang, ...; arXiv preprint arXiv:2509.01563, 2025	14	2025
Data Center Cooling System Optimization Using Offline Reinforcement Learning; X Zhan, X Zhu, P Cheng, X Hu, Z He, H Geng, J Leng, H Zheng, C Liu, ...; ICLR 2025, 2025	9	2025
Why Distillation can Outperform Zero-RL: The Role of Flexible Reasoning; X Hu, X Lu, L Mao, YF Zhang, T Zhang, B Wen, F Yang, T Gao, G Zhou; arXiv preprint arXiv:2505.21067, 2025	7	2025
Large-Scale Data Center Cooling Control via Sample-Efficient Reinforcement Learning; N Mu, X Hu, QS Jia, X Zhu, X He; 2024 IEEE 20th International Conference on Automation Science and …, 2024	7	2024
Integrating Mechanism and Data: Reinforcement Learning Based on Multi-Fidelity Model for Data Center Cooling Control; N Mu, X Hu, QS Jia; 2023 China Automation Congress (CAC), 5283-5288, 2023	5	2023
CLARIFY: Contrastive Preference Reinforcement Learning for Untangling Ambiguous Queries; N Mu, H Hu, X Hu, Y Yang, B Xu, QS Jia; ICML 2025, 2025	4	2025
Novel L2-Discriminant Locality Preserving Projection Integrated with Adaboost and Its Application to Fault Diagnosis; X Hu, Y Zhao, Y Xu, YL He, QX Zhu; 2020 IEEE 9th Data Driven Control and Learning Systems Conference (DDCLS …, 2020	2	2020
Simulation and AI for Critical Infrastructure; QS Jia, C Duan, S Feng, Y Zhu, X Hu; 2024 Winter Simulation Conference (WSC), 57-71, 2024	1	2024
Vehicle Extreme Control based on Offline Reinforcement Leaning; S Zhao, J Li, X Hu, J Zhang, C He; 2022 China Automation Congress (CAC), 4539-4543, 2022	1	2022
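Reinforcement Learning Methods for Green and Reliable Operation of Data Centers; 贾庆山, 唐静娴, 吴俊杰, 胡潇, 林依挺, 夏恒; Chinese Journal of Intelligent Science and Technology (智能科学与技术学报) 2 (4), 341-347, 0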

Articles 1–19
