Rongwu Xu

Cited by

	All	Since 2021
Citations	992	991
h-index	12	12
i10-index	13	13

Citations per year
Year	2023	2024	2025	2026
Citations	6	125	817	37

Public access

5 articles available
0 articles not available

Based on funding mandates

Rongwu Xu

University of Washington

Verified email at cs.washington.edu

Research interests: artificial intelligence, natural language processing, psychology, cognitive science, alignment


Title	Authors	Venue	Cited by	Year
Humanity's last exam	L Phan, A Gatti, Z Han, N Li, J Hu, H Zhang, CBC Zhang, M Shaaban, ...	arXiv preprint arXiv:2501.14249, 2025	306	2025
Knowledge Conflicts for LLMs: A Survey	R Xu, Z Qi, C Wang, H Wang, Y Zhang, W Xu	EMNLP, 2024	230	2024
The Earth is Flat because...: Investigating LLMs' Belief towards Misinformation via Persuasive Conversation	R Xu, BS Lin, S Yang, T Zhang, W Shi, T Zhang, Z Fang, W Xu, H Qiu	🏆 ACL Outstanding Paper, 2023	121	2023
How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States	Z Zhou, H Yu, X Zhang, R Xu, F Huang, Y Li	EMNLP, 2024	71	2024
MR-Ben: A Meta-Reasoning Benchmark for Evaluating System-2 Thinking in LLMs	Z Zeng, Y Liu, Y Wan, J Li, P Chen, J Dai, Y Yao, R Xu, Z Qi, W Zhao, ...	NeurIPS, 2024	43*	2024
On the role of attention heads in large language model safety	Z Zhou, H Yu, X Zhang, R Xu, F Huang, K Wang, Y Liu, J Fang, Y Li	ICLR Oral, 2024	39	2024
Long²RAG: Evaluating Long-Context & Long-Form Retrieval-Augmented Generation with Key Point Recall	Z Qi, R Xu, Z Guo, C Wang, H Zhang, W Xu	EMNLP, 2024	26	2024
Walking in Others' Shoes: How Perspective-Taking Guides Large Language Models in Reducing Toxicity and Bias	R Xu, Z Zhou, T Zhang, Z Qi, S Yao, K Xu, W Xu, H Qiu	EMNLP, 2024	24	2024
Preemptive Answer "Attacks" on Chain-of-Thought Reasoning	R Xu, Z Qi, W Xu	ACL, 2024	21	2024
Nuclear Deployed: Analyzing Catastrophic Risks in Decision-making of Autonomous LLM Agents	R Xu, X Li, S Chen, W Xu	ACL, 2025	20	2025
MISO: legacy-compatible privacy-preserving single sign-on using trusted execution environments	R Xu, S Yang, F Zhang, Z Fang	EuroS&P, 2023	18	2023
DebateQA: Evaluating question answering on debatable knowledge	R Xu, X Qi, Z Qi, W Xu, Z Guo	EACL, 2024	14	2024
Course-Correction: Safety Alignment Using Synthetic Preferences	R Xu, Y Cai, Z Zhou, R Gu, H Weng, Y Liu, T Zhang, W Xu, H Qiu	EMNLP, 2024	12	2024
The Singapore consensus on global AI safety research priorities	Y Bengio, T Maharaj, L Ong, S Russell, D Song, M Tegmark, L Xue, ...	arXiv preprint arXiv:2506.20702, 2025	7*	2025
AI awareness	X Li, H Shi, R Xu, W Xu	arXiv preprint arXiv:2504.20084, 2025	7	2025
Tempo: Confidentiality Preservation in Cloud-Based Neural Network Training	R Xu, Z Fang	IJCNN, 2024	5	2024
LSync: A universal event-synchronizing solution for live streaming	Y Xu, F Dang, R Xu, X Chen, Y Liu	INFOCOM, 2022	5	2022
AICrypto: A comprehensive benchmark for evaluating cryptography capabilities of large language models	Y Wang, Y Liu, L Ji, H Luo, W Li, X Zhou, C Feng, P Wang, Y Cao, ...	arXiv preprint arXiv:2507.09580, 2025	4	2025
Rules created by symbolic systems cannot constrain a learning system	SW Lin, R Xu, X Li, W Xu	Available at SSRN, 2025	4	2025
LifeRec: A mobile app for lifelog recording and ubiquitous recommendation	J Li, H Zhang, Z He, R Xu, P Wu, M Zhang, Y Liu, S Ma	CHIIR, 2022	4	2022
