
Yubo Ma
Title
Cited by
Year
VLMEvalKit: An Open-Source Toolkit for Evaluating Large Multi-Modality Models
H Duan, J Yang, Y Qiao, X Fang, L Chen, Y Liu, A Agarwal, Z Chen, M Li, ...
arXiv preprint arXiv:2407.11691, 2024
Cited by: 336; Year: 2024
Large language model is not a good few-shot information extractor, but a good reranker for hard samples!
Y Ma, Y Cao, YC Hong, A Sun
EMNLP 2023 (Findings), 2023
Cited by: 245; Year: 2023
Prompt for Extraction? PAIE: Prompting Argument Interaction for Event Argument Extraction
Y Ma, Z Wang, Y Cao, M Li, M Chen, K Wang, J Shao
ACL 2022, 2022
Cited by: 204; Year: 2022
MMLongBench-Doc: Benchmarking Long-context Document Understanding with Visualizations
Y Ma, Y Zang, L Chen, M Chen, Y Jiao, X Li, X Lu, Z Liu, Y Ma, X Dong, ...
NeurIPS 2024 spotlight (Dataset and Benchmark), 2024
Cited by: 103; Year: 2024
SciAgent: Tool-augmented Language Models for Scientific Reasoning
Y Ma, Z Gou, J Hao, R Xu, S Wang, L Pan, Y Yang, Y Cao, A Sun, ...
EMNLP 2024, 2024
Cited by: 74; Year: 2024
Towards verifiable generation: A benchmark for knowledge-aware language model attribution
X Li, Y Cao, L Pan, Y Ma, A Sun
ACL 2024 (Findings), 2023
Cited by: 33; Year: 2023
InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model
Y Zang, X Dong, P Zhang, Y Cao, Z Liu, S Ding, S Wu, Y Ma, H Duan, ...
ACL 2025 (Findings), 2025
Cited by: 32; Year: 2025
Learning to teach large language models logical reasoning
M Chen, Y Ma, K Song, Y Cao, Y Zhang, D Li
ACL 2024, 2023
Cited by: 32*; Year: 2023
Toward generalizable evaluation in the LLM era: A survey beyond benchmarks
Y Cao, S Hong, X Li, J Ying, Y Ma, H Liang, Y Liu, Z Yao, X Wang, ...
arXiv preprint arXiv:2504.18838, 2025
Cited by: 25; Year: 2025
AntiLeak-Bench: Preventing data contamination by automatically constructing benchmarks with updated real-world knowledge
X Wu, L Pan, Y Xie, R Zhou, S Zhao, Y Ma, M Du, R Mao, AT Luu, ...
ACL 2025, 2024
Cited by: 24; Year: 2024
Long context vs. RAG for LLMs: An evaluation and revisits
X Li, Y Cao, Y Ma, A Sun
arXiv preprint arXiv:2501.01880, 2024
Cited by: 22; Year: 2024
MMEKG: Multi-modal Event Knowledge Graph towards universal representation across modalities
Y Ma, Z Wang, M Li, Y Cao, M Chen, X Li, W Sun, K Deng, K Wang, A Sun, ...
ACL 2022 (System Demonstration Track), 2022
Cited by: 21; Year: 2022
Information extraction in low-resource scenarios: Survey and perspective
S Deng, Y Ma, N Zhang, Y Cao, B Hooi
2024 IEEE International Conference on Knowledge Graph (ICKG), 33-49, 2024
Cited by: 20; Year: 2024
Few-shot Event Detection: An Empirical Study and a Unified View
Y Ma, Z Wang, Y Cao, A Sun
ACL 2023, 2023
Cited by: 20; Year: 2023
TART: An open-source tool-augmented framework for explainable table-based reasoning
X Lu, L Pan, Y Ma, P Nakov, MY Kan
Findings of the Association for Computational Linguistics: NAACL 2025, 4323-4339, 2025
Cited by: 11; Year: 2025
Navigating the nuances: A fine-grained evaluation of vision-language navigation
Z Wang, M Wu, Y Cao, Y Ma, M Chen, T Tuytelaars
EMNLP 2024 (Findings), 2024
Cited by: 7; Year: 2024
MTR-Bench: A Comprehensive Benchmark for Multi-Turn Reasoning Evaluation
X Li, K Bao, Y Ma, M Li, W Wang, R Men, Y Zhang, F Feng, D Liu, J Lin
arXiv preprint arXiv:2505.17123, 2025
Cited by: 6; Year: 2025
Towards Storage-Efficient Visual Document Retrieval: An Empirical Study on Reducing Patch-Level Embeddings
Y Ma, J Li, Y Zang, X Wu, X Dong, P Zhang, Y Cao, H Duan, J Wang, ...
ACL 2025 (Findings), 2025
Cited by: 3; Year: 2025
EffiEval: Efficient and generalizable model evaluation via capability coverage maximization
Y Wang, J Ying, Y Cao, Y Ma, Y Jiang
arXiv preprint arXiv:2508.09662, 2025
Cited by: 2; Year: 2025
Synergistic Weak-Strong Collaboration by Aligning Preferences
Y Jiao, X Zhang, Z Wang, Y Ma, Z Deng, R Wang, C Bansal, S Rajmohan, ...
ACL 2025, 2025
Year: 2025