Lihong Li (李力鸿)

Cited by

	All	Since 2021
Citations	31073	17599
h-index	70	59
i10-index	106	88

Citations per year

Year	Citations
2008	96
2009	179
2010	218
2011	353
2012	442
2013	534
2014	600
2015	835
2016	965
2017	1237
2018	1765
2019	2355
2020	3054
2021	3593
2022	3435
2023	3871
2024	3576
2025	3056
2026	65

Public access

15 articles available, 0 articles not available

Based on funding mandates

Co-authors

John Langford - Microsoft Research New York - Verified email at hunch.net
Michael Littman - Brown University - Verified email at brown.edu
Jianfeng Gao - Microsoft Research, Redmond - Verified email at microsoft.com
Wei Chu（褚崴） - Inf - Verified email at gatsby.ucl.ac.uk
Li Deng - Chief AI Officer, Citadel (former) - Verified email at ieee.org
Robert Schapire - Microsoft Research - Verified email at microsoft.com
Bo Dai - Google Brain & Georgia Tech - Verified email at google.com
Denny Zhou - Research Scientist, Google DeepMind - Verified email at google.com
Jianshu Chen - Principal Scientist, Amazon - Verified email at ucla.edu
Dale Schuurmans - Google DeepMind & University of Alberta - Verified email at ualberta.ca
Asli Celikyilmaz - Researcher @ FAIR at Meta - Verified email at ieee.org
Zachary C. Lipton - Raj Reddy Associate Professor of Machine Learning @ Carnegie Mellon; Cofounder & CTO @ Abridge - Verified email at cmu.edu
Ji He - University of Washington - Verified email at uw.edu
Emma Brunskill - Associate Professor of Computer Science, Stanford University - Verified email at cs.stanford.edu
Yun-Nung (Vivian) Chen - National Taiwan University - Verified email at ieee.org
Thomas J. Walsh - Sony AI - Verified email at sony.com
Faisal Ahmed, PhD - Microsoft - Verified email at microsoft.com
Miroslav Dudik - Microsoft Research - Verified email at microsoft.com
Chong Wang - Apple - Verified email at cs.princeton.edu
Csaba Szepesvari - DeepMind & University of Alberta - Verified email at cs.ualberta.ca

Lihong Li (李力鸿)

AI Research Scientist, Meta

Verified email at meta.com

Reinforcement Learning, Large Language Models, Recommendation, Artificial Intelligence


Title	Cited by	Year
A contextual-bandit approach to personalized news article recommendation. L Li, W Chu, J Langford, RE Schapire. Proceedings of the 19th international conference on World wide web, 661-670, 2010	4119	2010
An empirical evaluation of thompson sampling. O Chapelle, L Li. Advances in neural information processing systems 24, 2011	2145	2011
Parallelized stochastic gradient descent. M Zinkevich, M Weimer, L Li, A Smola. Advances in neural information processing systems 23, 2010	1937	2010
Contextual bandits with linear payoff functions. W Chu, L Li, L Reyzin, R Schapire. Proceedings of the fourteenth international conference on artificial …, 2011	1536	2011
Neural approaches to conversational AI. J Gao, M Galley, L Li. The 41st international ACM SIGIR conference on research & development in …, 2018	1129	2018
Doubly robust policy evaluation and learning. M Dudík, J Langford, L Li. arXiv preprint arXiv:1103.4601, 2011	1044	2011
Doubly robust off-policy value evaluation for reinforcement learning. N Jiang, L Li. International conference on machine learning, 652-661, 2016	988	2016
Unbiased offline evaluation of contextual-bandit-based news article recommendation algorithms. L Li, W Chu, J Langford, X Wang. Proceedings of the fourth ACM international conference on Web search and …, 2011	759	2011
Towards a unified theory of state abstraction for MDPs. L Li, TJ Walsh, ML Littman. AI&M 1 (2), 3, 2006	741	2006
PAC model-free reinforcement learning. AL Strehl, L Li, E Wiewiora, J Langford, ML Littman. Proceedings of the 23rd international conference on Machine learning, 881-888, 2006	719	2006
Taming the monster: A fast and simple algorithm for contextual bandits. A Agarwal, D Hsu, S Kale, J Langford, L Li, R Schapire. International conference on machine learning, 1638-1646, 2014	665	2014
Sparse online learning via truncated gradient. J Langford, L Li, T Zhang. Advances in neural information processing systems 21, 2008	623	2008
Towards end-to-end reinforcement learning of dialogue agents for information access. B Dhingra, L Li, X Li, J Gao, YN Chen, F Ahmed, L Deng. Proceedings of the 55th Annual Meeting of the Association for Computational …, 2017	591*	2017
Doubly robust policy evaluation and optimization. M Dudík, D Erhan, J Langford, L Li	576	2014
End-to-end task-completion neural dialogue systems. X Li, YN Chen, L Li, J Gao, A Celikyilmaz. arXiv preprint arXiv:1703.01008, 2017	495	2017
Neuro-symbolic program synthesis. E Parisotto, A Mohamed, R Singh, L Li, D Zhou, P Kohli. arXiv preprint arXiv:1611.01855, 2016	465	2016
Breaking the curse of horizon: Infinite-horizon off-policy estimation. Q Liu, L Li, Z Tang, D Zhou. Advances in neural information processing systems 31, 2018	464	2018
Provably optimal algorithms for generalized linear contextual bandits. L Li, Y Lu, D Zhou. International Conference on Machine Learning, 2071-2080, 2017	447	2017
Dualdice: Behavior-agnostic estimation of discounted stationary distribution corrections. O Nachum, Y Chow, B Dai, L Li. Advances in neural information processing systems 32, 2019	440	2019
