
Damai Dai
DeepSeek-R1: Incentivizing reasoning capability in LLMs via reinforcement learning
D Guo, D Yang, H Zhang, J Song, R Zhang, R Xu, Q Zhu, S Ma, P Wang, ...
arXiv preprint arXiv:2501.12948, 2025
Cited by 7507* · 2025
DeepSeek-V3 technical report
A Liu, B Feng, B Xue, B Wang, B Wu, C Lu, C Zhao, C Deng, C Zhang, ...
arXiv preprint arXiv:2412.19437, 2024
Cited by 3676 · 2024
A survey on in-context learning
Q Dong, L Li, D Dai, C Zheng, J Ma, R Li, H Xia, J Xu, Z Wu, B Chang, ...
EMNLP 2024, 1107-1128, 2024
Cited by 3071 · 2024
Knowledge neurons in pretrained transformers
D Dai, L Dong, Y Hao, Z Sui, B Chang, F Wei
ACL 2022, 8493-8502, 2022
Cited by 888 · 2022
DeepSeek LLM: Scaling open-source language models with longtermism
X Bi, D Chen, G Chen, S Chen, D Dai, C Deng, H Ding, K Dong, Q Du, ...
arXiv preprint arXiv:2401.02954, 2024
Cited by 784 · 2024
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
DeepSeek-AI, A Liu, B Feng, B Wang, B Wang, B Liu, C Zhao, C Deng, ...
arXiv preprint arXiv:2405.04434, 2024
Cited by 766* · 2024
Math-Shepherd: Verify and reinforce LLMs step-by-step without human annotations
P Wang, L Li, Z Shao, RX Xu, D Dai, Y Li, D Chen, Y Wu, Z Sui
ACL 2024, 2024
Cited by 756* · 2024
DeepSeekMoE: Towards ultimate expert specialization in mixture-of-experts language models
D Dai, C Deng, C Zhao, RX Xu, H Gao, D Chen, J Li, W Zeng, X Yu, Y Wu, ...
ACL 2024, 2024
Cited by 735 · 2024
Why can GPT learn in-context? Language models implicitly perform gradient descent as meta-optimizers
D Dai, Y Sun, L Dong, Y Hao, S Ma, Z Sui, F Wei
ACL 2023 Findings, 2023
Cited by 613* · 2023
DeepSeek-VL2: Mixture-of-experts vision-language models for advanced multimodal understanding
Z Wu, X Chen, Z Pan, X Liu, W Liu, D Dai, H Gao, Y Ma, C Wu, B Wang, ...
arXiv preprint arXiv:2412.10302, 2024
Cited by 515* · 2024
DeepSeek-Coder-V2: Breaking the barrier of closed-source models in code intelligence
Q Zhu, D Guo, Z Shao, D Yang, P Wang, R Xu, Y Wu, Y Li, H Gao, S Ma, ...
arXiv preprint arXiv:2406.11931, 2024
Cited by 439* · 2024
DeepSeek-R1 incentivizes reasoning in LLMs through reinforcement learning
D Guo, D Yang, H Zhang, J Song, P Wang, Q Zhu, R Xu, R Zhang, S Ma, ...
Nature 645 (8081), 633-638, 2025
Cited by 367 · 2025
Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning
L Wang, L Li, D Dai, D Chen, H Zhou, F Meng, J Zhou, X Sun
EMNLP 2023 Best Long Paper, 2023
Cited by 252 · 2023
Native sparse attention: Hardware-aligned and natively trainable sparse attention
J Yuan, H Gao, D Dai, J Luo, L Zhao, Z Zhang, Z Xie, YX Wei, L Wang, ...
ACL 2025 Best Paper, 2025
Cited by 219* · 2025
Calibrating Factual Knowledge in Pretrained Language Models
Q Dong*, D Dai*, Y Song, J Xu, Z Sui, L Li
EMNLP 2022 Findings, 2022
Cited by 158 · 2022
On the representation collapse of sparse mixture of experts
Z Chi, L Dong, S Huang, D Dai, S Ma, B Patra, S Singhal, P Bajaj, X Song, ...
NeurIPS 2022 35, 34600-34613, 2022
Cited by 154 · 2022
StableMoE: Stable Routing Strategy for Mixture of Experts
D Dai, L Dong, S Ma, B Zheng, Z Sui, B Chang, F Wei
ACL 2022, 2022
Cited by 114 · 2022
Preliminary study on the construction of Chinese medical knowledge graph
O Byambasuren, Y Yang, Z Sui, D Dai, B Chang, S Li, H Zan
Journal of Chinese Information Processing 33 (10), 1-9, 2019
Cited by 112* · 2019
Auxiliary-loss-free load balancing strategy for mixture-of-experts
L Wang, H Gao, C Zhao, X Sun, D Dai
arXiv preprint arXiv:2408.15664, 2024
Cited by 92* · 2024
LiveBot: Generating live video comments based on visual and textual contexts
S Ma, L Cui, D Dai, F Wei, X Sun
AAAI 2019 33 (01), 6810-6817, 2019
Cited by 77 · 2019
Articles 1–20