Rongwu Xu
Title
Cited by
Year
Humanity's Last Exam
L Phan, A Gatti, Z Han, N Li, J Hu, H Zhang, CBC Zhang, M Shaaban, ...
arXiv preprint arXiv:2501.14249, 2025
Cited by: 306, Year: 2025
Knowledge Conflicts for LLMs: A Survey
R Xu, Z Qi, C Wang, H Wang, Y Zhang, W Xu
EMNLP, 2024
Cited by: 230, Year: 2024
The Earth is Flat because...: Investigating LLMs' Belief towards Misinformation via Persuasive Conversation
R Xu, BS Lin, S Yang, T Zhang, W Shi, T Zhang, Z Fang, W Xu, H Qiu
🏆 ACL Outstanding Paper, 2023
Cited by: 121, Year: 2023
How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States
Z Zhou, H Yu, X Zhang, R Xu, F Huang, Y Li
EMNLP, 2024
Cited by: 71, Year: 2024
MR-Ben: A Meta-Reasoning Benchmark for Evaluating System-2 Thinking in LLMs
Z Zeng, Y Liu, Y Wan, J Li, P Chen, J Dai, Y Yao, R Xu, Z Qi, W Zhao, ...
NeurIPS, 2024
Cited by: 43*, Year: 2024
On the Role of Attention Heads in Large Language Model Safety
Z Zhou, H Yu, X Zhang, R Xu, F Huang, K Wang, Y Liu, J Fang, Y Li
ICLR Oral, 2024
Cited by: 39, Year: 2024
Long²RAG: Evaluating Long-Context & Long-Form Retrieval-Augmented Generation with Key Point Recall
Z Qi*, R Xu*, Z Guo, C Wang, H Zhang, W Xu
EMNLP, 2024
Cited by: 26, Year: 2024
Walking in Others' Shoes: How Perspective-Taking Guides Large Language Models in Reducing Toxicity and Bias
R Xu, Z Zhou, T Zhang, Z Qi, S Yao, K Xu, W Xu, H Qiu
EMNLP, 2024
Cited by: 24, Year: 2024
Preemptive Answer "Attacks" on Chain-of-Thought Reasoning
R Xu, Z Qi, W Xu
ACL, 2024
Cited by: 21, Year: 2024
Nuclear Deployed: Analyzing Catastrophic Risks in Decision-making of Autonomous LLM Agents
R Xu, X Li, S Chen, W Xu
ACL, 2025
Cited by: 20, Year: 2025
MISO: Legacy-Compatible Privacy-Preserving Single Sign-On Using Trusted Execution Environments
R Xu, S Yang, F Zhang, Z Fang
EuroS&P, 2023
Cited by: 18, Year: 2023
DebateQA: Evaluating Question Answering on Debatable Knowledge
R Xu, X Qi, Z Qi, W Xu, Z Guo
EACL, 2024
Cited by: 14, Year: 2024
Course-Correction: Safety Alignment Using Synthetic Preferences
R Xu, Y Cai, Z Zhou, R Gu, H Weng, Y Liu, T Zhang, W Xu, H Qiu
EMNLP, 2024
Cited by: 12, Year: 2024
The Singapore Consensus on Global AI Safety Research Priorities
Y Bengio, T Maharaj, L Ong, S Russell, D Song, M Tegmark, L Xue, ...
arXiv preprint arXiv:2506.20702, 2025
Cited by: 7*, Year: 2025
AI Awareness
X Li, H Shi, R Xu, W Xu
arXiv preprint arXiv:2504.20084, 2025
Cited by: 7, Year: 2025
Tempo: Confidentiality Preservation in Cloud-Based Neural Network Training
R Xu, Z Fang
IJCNN, 2024
Cited by: 5, Year: 2024
LSync: A Universal Event-Synchronizing Solution for Live Streaming
Y Xu, F Dang, R Xu, X Chen, Y Liu
INFOCOM, 2022
Cited by: 5, Year: 2022
AICrypto: A Comprehensive Benchmark for Evaluating Cryptography Capabilities of Large Language Models
Y Wang, Y Liu, L Ji, H Luo, W Li, X Zhou, C Feng, P Wang, Y Cao, ...
arXiv preprint arXiv:2507.09580, 2025
Cited by: 4, Year: 2025
Rules Created by Symbolic Systems Cannot Constrain a Learning System
SW Lin, R Xu, X Li, W Xu
Available at SSRN, 2025
Cited by: 4, Year: 2025
LifeRec: A Mobile App for Lifelog Recording and Ubiquitous Recommendation
J Li, H Zhang, Z He, R Xu, P Wu, M Zhang, Y Liu, S Ma
CHIIR, 2022
Cited by: 4, Year: 2022