| AmpleGCG: Learning a universal and transferable generative model of adversarial suffixes for jailbreaking both open and closed LLMs Z Liao, H Sun COLM 2024, 2024 | 159 | 2024 |
| ChatCounselor: A large language models for mental health support JM Liu, D Li, H Cao, T Ren, Z Liao, J Wu CIKM 2023, 2023 | 155 | 2023 |
| EIA: Environmental injection attack on generalist web agents for privacy leakage Z Liao, L Mo, C Xu, M Kang, J Zhang, C Xiao, Y Tian, B Li, H Sun ICLR 2025, 2024 | 113 | 2024 |
| ScienceAgentBench: Toward rigorous assessment of language agents for data-driven scientific discovery Z Chen, S Chen, Y Ning, Q Zhang, B Wang, B Yu, Y Li, Z Liao, C Wei, Z Lu, ... ICLR 2025, 2024 | 102 | 2024 |
| Introducing v0.5 of the AI Safety Benchmark from MLCommons B Vidgen, A Agrawal, AM Ahmed, V Akinwande, N Al-Nuaimi, N Alfaraj, ... arXiv preprint arXiv:2404.12241, 2024 | 72 | 2024 |
| AdvWeb: Controllable black-box attacks on VLM-powered web agents C Xu, M Kang, J Zhang, Z Liao, L Mo, M Yuan, H Sun, B Li ICML 2025, 2024 | 54* | 2024 |
| RobustLR: A diagnostic benchmark for evaluating logical robustness of deductive reasoners S Sanyal, Z Liao, X Ren EMNLP 2022, 9614-9631, 2022 | 28* | 2022 |
| A trembling house of cards? mapping adversarial attacks against language agents L Mo, Z Liao, B Zheng, Y Su, C Xiao, H Sun arXiv preprint arXiv:2402.10196, 2024 | 22 | 2024 |
| AttributionBench: How Hard is Automatic Attribution Evaluation? Y Li, X Yue, Z Liao, H Sun ACL 2024, 2024 | 19 | 2024 |
| Mind2Web 2: Evaluating Agentic Search with Agent-as-a-Judge B Gou, Z Huang, Y Ning, Y Gu, M Lin, W Qi, A Kopanev, B Yu, ... arXiv preprint arXiv:2506.21506, 2025 | 18 | 2025 |
| ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery, October 2024 Z Chen, S Chen, Y Ning, Q Zhang, B Wang, B Yu, Y Li, Z Liao, C Wei, Z Lu, ... URL https://arxiv.org/abs/2410.05080 v1, 2024 | 13 | 2024 |
| AmpleGCG-Plus: A strong generative model of adversarial suffixes to jailbreak LLMs with higher success rates in fewer attempts V Kumar, Z Liao, J Jones, H Sun arXiv preprint arXiv:2410.22143, 2024 | 11 | 2024 |
| In Search of the Long-Tail: Systematic Generation of Long-Tail Inferential Knowledge via Logical Rule Guided Search H Li, Y Ning, Z Liao, S Wang, XL Li, X Lu, W Zhao, F Brahman, Y Choi, ... EMNLP 2024, 2023 | 8* | 2023 |
| Mind2Web 2: Evaluating agentic search with agent-as-a-judge, 2025 B Gou, Z Huang, Y Ning, Y Gu, M Lin, W Qi, A Kopanev, B Yu, ... URL https://arxiv.org/abs/2506.21506, 2025 | 7 | 2025 |
| Agent learning via early experience K Zhang, X Chen, B Liu, T Xue, Z Liao, Z Liu, X Wang, Y Ning, Z Chen, ... arXiv preprint arXiv:2510.08558, 2025 | 6 | 2025 |
| RedTeamCUA: Realistic adversarial testing of computer-use agents in hybrid web-OS environments Z Liao, J Jones, L Jiang, Y Ning, E Fosler-Lussier, Y Su, Z Lin, H Sun arXiv preprint arXiv:2505.21936, 2025 | 6 | 2025 |
| Joint demonstration and preference learning improves policy alignment with human feedback C Li, S Zeng, Z Liao, J Li, D Kang, A Garcia, M Hong ICLR 2025 (SpotLight), 2024 | 6 | 2024 |
| Introducing v0.5 of the AI Safety Benchmark from MLCommons, 2024 B Vidgen, A Agrawal, AM Ahmed, V Akinwande, N Al-Nuaimi, N Alfaraj, ... URL https://arxiv.org/abs/2404.12241 | 6 | |
| AdvWeb: Controllable black-box attacks on VLM-powered web agents, 2024 C Xu, M Kang, J Zhang, Z Liao, L Mo, M Yuan, H Sun, B Li URL https://arxiv.org/abs/2410.17401 | 5 | |
| Joint Reward and Policy Learning with Demonstrations and Human Feedback Improves Alignment C Li, S Zeng, Z Liao, J Li, D Kang, A Garcia, M Hong The Thirteenth International Conference on Learning Representations | 3 | |