Pengcheng He

Cited by

	All	Since 2021
Citations	20052	19062
h-index	41	40
i10-index	57	56

Year	Citations
2019	207
2020	711
2021	1271
2022	1898
2023	3464
2024	5604
2025	6599
2026	202

Public access

4 articles available, 0 articles not available (based on funding mandates)

Co-authors

Weizhu Chen	Microsoft, Technical Fellow	Verified email at microsoft.com
Jianfeng Gao	Microsoft Research, Redmond	Verified email at microsoft.com
Xiaodong Liu	Microsoft Research, Redmond	Verified email at microsoft.com
Tuo Zhao	Associate Professor, Georgia Tech	Verified email at gatech.edu
Baolin Peng	Microsoft Research, Redmond	Verified email at microsoft.com
Haoming Jiang	OpenAI; Ex-Amazon; Georgia Institute of Technology	Verified email at gatech.edu
Hao Cheng	Microsoft Research / University of Washington	Verified email at microsoft.com
Jiawei Han	Abel Bliss Professor of Computer Science, University of Illinois	Verified email at cs.uiuc.edu
Liyuan Liu	Thinking Machines Lab	Verified email at illinois.edu
Hoifung Poon	General Manager, Microsoft Research	Verified email at microsoft.com
Adam Trischler	Microsoft Research, McGill University	Verified email at microsoft.com
Tao Shen	Oracle	Verified email at oracle.com
Guodong Long	Associate Professor, Faculty of Engineering and IT, University of Technology Sydney	Verified email at uts.edu.au
William Darling	Cohere	Verified email at cohere.com
Yu Wang	Microsoft Research	Verified email at microsoft.com

Pengcheng He

Microsoft

Verified email at microsoft.com

Machine Learning


Title	Authors	Venue	Cited by	Year
DeBERTa: Decoding-enhanced BERT with disentangled attention	P He, X Liu, J Gao, W Chen	ICLR 2021, 2020	4380	2020
On the variance of the adaptive learning rate and beyond	L Liu, H Jiang, P He, W Chen, X Liu, J Gao, J Han	ICLR 2019, 2019	2907	2019
DeBERTaV3: Improving DeBERTa using ELECTRA-style pre-training with gradient-disentangled embedding sharing	P He, J Gao, W Chen	ICLR 2023, 2021	1730	2021
Multi-task deep neural networks for natural language understanding	X Liu, P He, W Chen, J Gao	ACL 2019, 2019	1654	2019
Instruction tuning with GPT-4	B Peng, C Li, P He, M Galley, J Gao	arXiv preprint arXiv:2304.03277, 2023	1292	2023
AdaLoRA: Adaptive budget allocation for parameter-efficient fine-tuning	Q Zhang, M Chen, A Bukharin, N Karampatziakis, P He, Y Cheng, ...	arXiv preprint arXiv:2303.10512, 2023	1068	2023
Query rewriting in retrieval-augmented large language models	X Ma, Y Gong, P He, H Zhao, N Duan	Proceedings of the 2023 Conference on Empirical Methods in Natural Language …, 2023	640	2023
Check your facts and try again: Improving large language models with external knowledge and automated feedback	B Peng, M Galley, P He, H Cheng, Y Xie, Y Hu, Q Huang, L Liden, Z Yu, ...	arXiv preprint arXiv:2302.12813, 2023	632	2023
SMART: Robust and efficient fine-tuning for pre-trained natural language models through principled regularized optimization	H Jiang, P He, W Chen, X Liu, J Gao, T Zhao	ACL 2020, 2019	584	2019
DoLa: Decoding by contrasting layers improves factuality in large language models	YS Chuang, Y Xie, H Luo, Y Kim, J Glass, P He	arXiv preprint arXiv:2309.03883, 2023	555	2023
Patch diffusion: Faster and more data-efficient training of diffusion models	Z Wang, Y Jiang, H Zheng, P Wang, P He, Z Wang, W Chen, M Zhou	Advances in Neural Information Processing Systems 36, 72137-72154, 2023	404	2023
Diffusion-GAN: Training GANs with Diffusion	Z Wang, H Zheng, P He, W Chen, M Zhou	ICLR 2023, 2022	397	2022
Generation-augmented retrieval for open-domain question answering	Y Mao, P He, X Liu, Y Shen, J Gao, J Han, W Chen	Proceedings of the 59th Annual Meeting of the Association for Computational …, 2021	348	2021
LoftQ: LoRA-fine-tuning-aware quantization for large language models	Y Li, Y Yu, C Liang, P He, N Karampatziakis, W Chen, T Zhao	arXiv preprint arXiv:2310.08659, 2023	284	2023
Adversarial training for large neural language models	X Liu, H Cheng, P He, W Chen, Y Wang, H Poon, J Gao	arXiv preprint arXiv:2004.08994, 2020	255	2020
Improving multi-task deep neural networks via knowledge distillation for natural language understanding	X Liu, P He, W Chen, J Gao	arXiv preprint arXiv:1904.09482, 2019	248	2019
Chain of draft: Thinking faster by writing less	S Xu, W Xie, L Zhao, P He	arXiv preprint arXiv:2502.18600, 2025	174	2025
Guiding large language models via directional stimulus prompting	Z Li, B Peng, P He, M Galley, J Gao, X Yan	Advances in Neural Information Processing Systems 36, 62630-62656, 2023	164	2023
Truncated diffusion probabilistic models and diffusion-based adversarial auto-encoders	H Zheng, P He, W Chen, M Zhou	ICLR 2023, 2022	149*	2022
LoSparse: Structured compression of large language models based on low-rank and sparse approximation	Y Li, Y Yu, Q Zhang, C Liang, P He, W Chen, T Zhao	International Conference on Machine Learning, 20336-20350, 2023	144	2023
