Haipeng Luo
Verified email at usc.edu - Homepage
Title
Cited by
Year
WizardMath: Empowering mathematical reasoning for large language models via reinforced Evol-Instruct
H Luo, Q Sun, C Xu, P Zhao, J Lou, C Tao, X Geng, Q Lin, S Chen, ...
arXiv preprint arXiv:2308.09583, 2023
Cited by: 636, Year: 2023
Fast convergence of regularized learning in games
V Syrgkanis, A Agarwal, H Luo, RE Schapire
Advances in Neural Information Processing Systems 28, 2015
Cited by: 358, Year: 2015
Adaptive resource provisioning for the cloud using online bin packing
W Song, Z Xiao, Q Chen, H Luo
IEEE Transactions on Computers 63 (11), 2647-2660, 2013
Cited by: 309, Year: 2013
Corralling a band of bandit algorithms
A Agarwal, H Luo, B Neyshabur, RE Schapire
Conference on Learning Theory, 12-38, 2017
Cited by: 226, Year: 2017
More adaptive algorithms for adversarial bandits
CY Wei, H Luo
Conference on Learning Theory, 1263-1291, 2018
Cited by: 221, Year: 2018
Variance-reduced and projection-free stochastic optimization
E Hazan, H Luo
International Conference on Machine Learning, 1263-1271, 2016
Cited by: 213, Year: 2016
Learning adversarial Markov decision processes with bandit feedback and unknown transition
C Jin, T Jin, H Luo, S Sra, T Yu
International Conference on Machine Learning, 4860-4869, 2020
Cited by: 204*, Year: 2020
Achieving all with no parameters: AdaNormalHedge
H Luo, RE Schapire
Conference on Learning Theory, 1286-1304, 2015
Cited by: 184, Year: 2015
Practical contextual bandits with regression oracles
D Foster, A Agarwal, M Dudík, H Luo, R Schapire
International Conference on Machine Learning, 1539-1548, 2018
Cited by: 180, Year: 2018
Linear last-iterate convergence in constrained saddle-point optimization
CY Wei, CW Lee, M Zhang, H Luo
arXiv preprint arXiv:2006.09517, 2020
Cited by: 173, Year: 2020
A new algorithm for non-stationary contextual bandits: Efficient, optimal and parameter-free
Y Chen, CW Lee, H Luo, CY Wei
Conference on Learning Theory, 696-726, 2019
Cited by: 165, Year: 2019
Non-stationary reinforcement learning without prior knowledge: An optimal black-box approach
CY Wei, H Luo
Conference on Learning Theory, 4300-4354, 2021
Cited by: 162, Year: 2021
Efficient Contextual Bandits in Non-stationary Worlds
H Luo, CY Wei, A Agarwal, J Langford
arXiv preprint arXiv:1708.01799, 2017
Cited by: 160, Year: 2017
Model-free reinforcement learning in infinite-horizon average-reward Markov decision processes
CY Wei, MJ Jahromi, H Luo, H Sharma, R Jain
International Conference on Machine Learning, 10170-10180, 2020
Cited by: 147, Year: 2020
Last-iterate convergence of decentralized optimistic gradient descent/ascent in infinite-horizon competitive Markov games
CY Wei, CW Lee, M Zhang, H Luo
Conference on Learning Theory, 4259-4299, 2021
Cited by: 131, Year: 2021
Model selection for contextual bandits
DJ Foster, A Krishnamurthy, H Luo
Advances in Neural Information Processing Systems 32, 2019
Cited by: 129, Year: 2019
Efficient second order online learning by sketching
H Luo, A Agarwal, N Cesa-Bianchi, J Langford
Advances in Neural Information Processing Systems 29, 2016
Cited by: 124, Year: 2016
Logistic regression: The importance of being improper
DJ Foster, S Kale, H Luo, M Mohri, K Sridharan
Conference on Learning Theory, 167-208, 2018
Cited by: 121, Year: 2018
Beating stochastic and adversarial semi-bandits optimally and simultaneously
J Zimmert, H Luo, CY Wei
International Conference on Machine Learning, 7683-7692, 2019
Cited by: 112, Year: 2019
Optimal and adaptive algorithms for online boosting
A Beygelzimer, S Kale, H Luo
International Conference on Machine Learning, 2323-2331, 2015
Cited by: 104, Year: 2015