| Volcano: mitigating multimodal hallucination through self-feedback guided revision S Lee, SH Park, Y Jo, M Seo Proceedings of the 2024 Conference of the North American Chapter of the …, 2024 | 102 | 2024 |
| Prometheus-Vision: Vision-language model as a judge for fine-grained evaluation S Lee, S Kim, S Park, G Kim, M Seo Findings of the Association for Computational Linguistics: ACL 2024, 11286-11315, 2024 | 72 | 2024 |
| Aligning to thousands of preferences via system message generalization S Lee, SH Park, S Kim, M Seo Advances in Neural Information Processing Systems 37, 73783-73829, 2024 | 66 | 2024 |
| Paper2Code: Automating code generation from scientific papers in machine learning M Seo, J Baek, S Lee, SJ Hwang arXiv preprint arXiv:2504.17192, 2025 | 31 | 2025 |
| LIQUID: A framework for list question answering dataset generation S Lee, H Kim, J Kang Proceedings of the AAAI Conference on Artificial Intelligence 37 (11), 13014 …, 2023 | 29 | 2023 |
| Evaluating language models as synthetic data generators S Kim, J Suk, X Yue, V Viswanathan, S Lee, Y Wang, K Gashteovski, ... Proceedings of the 63rd Annual Meeting of the Association for Computational …, 2025 | 25 | 2025 |
| The BiGGen Bench: A principled benchmark for fine-grained evaluation of language models with language models S Kim, J Suk, JY Cho, S Longpre, C Kim, D Yoon, G Son, Y Cho, ... Proceedings of the 2025 Conference of the Nations of the Americas Chapter of …, 2025 | 17 | 2025 |
| How does vision-language adaptation impact the safety of vision language models? S Lee, G Kim, J Kim, H Lee, H Chang, SH Park, M Seo arXiv preprint arXiv:2410.07571, 2024 | 9 | 2024 |
| Zero-shot dense video captioning by jointly optimizing text and moment Y Jo, S Lee, ASJ Lee, H Lee, H Oh, M Seo arXiv preprint arXiv:2307.02682, 2023 | 9 | 2023 |
| Scaling evaluation-time compute with reasoning models as process evaluators S Kim, I Wu, J Lee, X Yue, S Lee, M Moon, K Gashteovski, C Lawrence, ... arXiv preprint arXiv:2503.19877, 2025 | 7 | 2025 |
| LG AI Research & KAIST at EHRSQL 2024: Self-training large language models with pseudo-labeled unanswerable questions for a reliable text-to-SQL system on EHRs Y Jo, S Lee, M Seo, SJ Hwang, M Lee arXiv preprint arXiv:2405.11162, 2024 | 6 | 2024 |
| Efficient long context language model retrieval with compression M Seo, J Baek, S Lee, SJ Hwang Proceedings of the 63rd Annual Meeting of the Association for Computational …, 2025 | 2 | 2025 |
| The CoT Encyclopedia: Analyzing, Predicting, and Controlling how a Reasoning Model will Think S Lee, S Kim, M Seo, Y Jo, D Go, H Hwang, J Park, X Yue, S Welleck, ... arXiv preprint arXiv:2505.10185, 2025 | 1 | 2025 |
| Lost in the Noise: How Reasoning Models Fail with Contextual Distractors S Lee, Y Jo, M Seo, M Lee, M Seo arXiv preprint arXiv:2601.07226, 2026 | | 2026 |