| Sotopia: Interactive evaluation for social intelligence in language agents X Zhou, H Zhu, L Mathur, R Zhang, H Yu, Z Qi, LP Morency, Y Bisk, ... ICLR 2024, 2023 | 288 | 2023 |
| SOTOPIA-π: Interactive Learning of Socially Intelligent Language Agents R Wang, H Yu, W Zhang, Z Qi, M Sap, G Neubig, Y Bisk, H Zhu ACL 2024, 2024 | 70 | 2024 |
| Counting the Bugs in ChatGPT's Wugs: A Multilingual Investigation into the Morphological Capabilities of a Large Language Model L Weissweiler, V Hofmann, A Kantharuban, A Cai, R Dutt, A Hengle, ... EMNLP 2023, 2023 | 51 | 2023 |
| MMoE: Enhancing Multimodal Models with Mixtures of Multimodal Interaction Experts H Yu, Z Qi, L Jang, R Salakhutdinov, LP Morency, PP Liang EMNLP 2024, 2024 | 29 | 2024 |
| HEMM: Holistic Evaluation of Multimodal Foundation Models PP Liang, A Goindani, T Chafekar, L Mathur, H Yu, R Salakhutdinov, ... NeurIPS 2024 (D&B Track), 2024 | 27 | 2024 |
| RFiD: Towards Rational Fusion-in-Decoder for Open-Domain Question Answering C Wang, H Yu, Y Zhang ACL 2023 (findings), 2023 | 24 | 2023 |
| ResearchTown: Simulator of Human Research Community H Yu, Z Hong, Z Cheng, K Zhu, K Xuan, J Yao, T Feng, J You ICML 2025, 2024 | 23 | 2024 |
| Synthetic Data RL: Task Definition Is All You Need Y Guo, Z Guo, C Huang, ZA Wang, Z Zhang, H Yu, H Zhang, Y Shen arXiv preprint arXiv:2505.17063, 2025 | 9 | 2025 |
| In-Context Learning May Not Elicit Trustworthy Reasoning: A-Not-B Errors in Pretrained Language Models P Han, P Song, H Yu, J You EMNLP 2024 (findings), 2024 | 9 | 2024 |
| Multi-agent evolve: LLM self-improve through co-evolution Y Chen, Y Wang, S Zhu, H Yu, T Feng, M Zhang, M Patwary, J You arXiv preprint arXiv:2510.23595, 2025 | 8 | 2025 |
| Time-R1: Towards Comprehensive Temporal Reasoning in LLMs Z Liu, P Han, H Yu, H Li, J You arXiv preprint arXiv:2505.13508, 2025 | 8 | 2025 |
| SafeScientist: Enhancing AI Scientist Safety for Risk-Aware Scientific Discovery K Zhu, J Zhang, Z Qi, N Shang, Z Liu, P Han, Y Su, H Yu, J You EMNLP 2025, 2025 | 7* | 2025 |
| TRAMS: Training-free Memory Selection for Long-range Language Modeling H Yu, C Wang, Y Zhang, W Bi EMNLP 2023 (findings), 2023 | 7 | 2023 |
| Uni-Encoder: A Fast and Accurate Response Selection Paradigm for Generation-Based Dialogue Systems C Song, H He, H Yu, P Fang, L Cui, Z Lan ACL 2023 (findings), 2023 | 7 | 2023 |
| Sotopia-RL: Reward Design for Social Intelligence H Yu, Z Qi, Y Zhao, K Nottingham, K Xuan, BP Majumder, H Zhu, ... arXiv preprint arXiv:2508.03905, 2025 | 6* | 2025 |
| Beyond Facts: Evaluating Intent Hallucination in Large Language Models Y Hao, H Yu, J You ACL 2025, 2025 | 5 | 2025 |
| LiveTradeBench: Seeking Real-World Alpha with Large Language Models H Yu, F Li, J You arXiv preprint arXiv:2511.03628, 2025 | 1 | 2025 |
| TinyScientist: An Interactive, Extensible, and Controllable Framework for Building Research Agents H Yu, K Xuan, F Li, K Zhu, Z Lei, J Zhang, Z Qi, K Richardson, J You EMNLP 2025 (Demo Track), 2025 | 1 | 2025 |
| ConsistencyChecker: Tree-based Evaluation of LLM Generalization Capabilities Z Hong, H Yu, J You ACL 2025, 2025 | 1 | 2025 |
| MINT: Multimodal Instruction Tuning with Multimodal Interaction Grouping X Shan, Q Cao, X Han, H Yu, PP Liang arXiv preprint arXiv:2506.02308, 2025 | 1 | 2025 |