Cunxiang Wang
Verified email at tsinghua.edu.cn
Title
Cited by
Year
A survey on evaluation of large language models
Y Chang, X Wang, J Wang, Y Wu, L Yang, K Zhu, H Chen, X Yi, C Wang, ...
ACM transactions on intelligent systems and technology 15 (3), 1-45, 2024
Cited by: 5023; Year: 2024
PandaLM: An automatic evaluation benchmark for LLM instruction tuning optimization
Y Wang, Z Yu, Z Zeng, L Yang, C Wang, H Chen, C Jiang, R Xie, J Wang, ...
ICLR 2024, 2023
Cited by: 351; Year: 2023
Survey on factuality in large language models
C Wang, X Liu, Y Yue, Q Guo, X Hu, X Tang, T Zhang, C Jiayang, Y Yao, ...
ACM Computing Surveys 58 (1), 1-37, 2025
Cited by: 343*; Year: 2025
Knowledge conflicts for LLMs: A survey
R Xu, Z Qi, Z Guo, C Wang, H Wang, Y Zhang, W Xu
EMNLP 2024, 2024
Cited by: 221; Year: 2024
Does It Make Sense? And Why? A Pilot Study for Sense Making and Explanation
C Wang, S Liang, Y Zhang, X Li, T Gao
ACL 2019, 4020–4026, 2019
Cited by: 122; Year: 2019
SemEval-2020 task 4: Commonsense validation and explanation
C Wang, S Liang, Y Jin, Y Wang, X Zhu, Y Zhang
SemEval-2020 Task track, 2020
Cited by: 121; Year: 2020
Evaluating Open-QA Evaluation
C Wang, S Cheng, Q Guo, Y Yue, B Ding, Z Xu, Y Wang, X Hu, Z Zhang, ...
Advances in Neural Information Processing Systems 36, 2023
Cited by: 113; Year: 2023
GLM-4.5: Agentic, reasoning, and coding (ARC) foundation models
A Zeng, X Lv, Q Zheng, Z Hou, B Chen, C Xie, C Wang, D Yin, H Zeng, ...
arXiv preprint arXiv:2508.06471, 2025
Cited by: 112; Year: 2025
Can generative pre-trained language models serve as knowledge bases for closed-book QA?
C Wang, P Liu, Y Zhang
ACL 2021, 2021
Cited by: 100; Year: 2021
RAGChecker: A fine-grained framework for diagnosing retrieval-augmented generation
D Ru, L Qiu, X Hu, T Zhang, P Shi, S Chang, C Jiayang, C Wang, S Sun, ...
Advances in Neural Information Processing Systems 37, 21999-22027, 2024
Cited by: 97; Year: 2024
LLMs with chain-of-thought are non-causal reasoners
G Bao, H Zhang, L Yang, C Wang, Y Zhang
CoRR, 2024
Cited by: 47; Year: 2024
A survey on evaluation of large language models. arXiv
Y Chang, X Wang, J Wang, Y Wu, L Yang, K Zhu, H Chen, X Yi, C Wang, ...
Preprint posted online on Dec 29, 2023
Cited by: 44; Year: 2023
NovelQA: Benchmarking question answering on documents exceeding 200k tokens
C Wang, R Ning, B Pan, T Wu, Q Guo, C Deng, G Bao, X Hu, Z Zhang, ...
ICLR 2025, 2024
Cited by: 42*; Year: 2024
A survey on evaluation of large language models (2023)
Y Chang, X Wang, J Wang, Y Wu, L Yang, K Zhu, H Chen, X Yi, C Wang, ...
Cited by: 42*
SHIELD: Evaluation and defense strategies for copyright compliance in LLM text generation
X Liu, T Sun, T Xu, F Wu, C Wang, X Wang, J Gao
EMNLP 2024, 2024
Cited by: 37; Year: 2024
Self-DC: When to retrieve and when to generate? Self divide-and-conquer for compositional unknown questions
H Wang, B Xue, B Zhou, T Zhang, C Wang, G Chen, H Wang, K Wong
CoRR, 2024
Cited by: 35*; Year: 2024
SPaR: Self-play with tree-search refinement to improve instruction-following in large language models
J Cheng, X Liu, C Wang, X Gu, Y Lu, D Zhang, Y Dong, J Tang, H Wang, ...
ICLR, 2025
Cited by: 27; Year: 2025
LongRAG: Evaluating Long-Context & Long-Form Retrieval-Augmented Generation with Key Point Recall
Z Qi, R Xu, Z Guo, C Wang, H Zhang, W Xu
arXiv preprint arXiv:2410.23000, 2024
Cited by: 26; Year: 2024
Exploring generalization ability of pretrained language models on arithmetic and logical reasoning
C Wang, B Zheng, Y Niu, Y Zhang
CCF International Conference on Natural Language Processing and Chinese …, 2021
Cited by: 26; Year: 2021
RFiD: Towards Rational Fusion-in-Decoder for Open-Domain Question Answering
C Wang, H Yu, Y Zhang
Findings of the Association for Computational Linguistics: ACL 2023, 2023
Cited by: 24; Year: 2023