Yu Bai
OpenAI
Verified email at openai.com
Title
Cited by
Year
OpenAI o1 system card
A Jaech, A Kalai, A Lerer, A Richardson, A El-Kishky, A Low, A Helyar, ...
arXiv preprint arXiv:2412.16720, 2024
Cited by: 1518; Year: 2024
The landscape of empirical risk for nonconvex losses
S Mei, Y Bai, A Montanari
The Annals of Statistics 46 (6A), 2747-2774, 2018
Cited by: 416; Year: 2018
Transformers as statisticians: Provable in-context learning with in-context algorithm selection
Y Bai, F Chen, H Wang, C Xiong, S Mei
Advances in Neural Information Processing Systems 36, 57125-57211, 2023
Cited by: 332; Year: 2023
Negative preference optimization: From catastrophic collapse to effective unlearning
R Zhang, L Lin, Y Bai, S Mei
arXiv preprint arXiv:2404.05868, 2024
Cited by: 306; Year: 2024
Policy finetuning: Bridging sample-efficient offline and online reinforcement learning
T Xie, N Jiang, H Wang, C Xiong, Y Bai
Advances in Neural Information Processing Systems 34, 27395-27407, 2021
Cited by: 242; Year: 2021
Provable self-play algorithms for competitive reinforcement learning
Y Bai, C Jin
International Conference on Machine Learning, 551-560, 2020
Cited by: 228; Year: 2020
gpt-oss-120b & gpt-oss-20b model card
S Agarwal, L Ahmad, J Ai, S Altman, A Applebaum, E Arbus, RK Arora, ...
arXiv preprint arXiv:2508.10925, 2025
Cited by: 191; Year: 2025
Near-Optimal Reinforcement Learning with Self-Play
Y Bai, C Jin, T Yu
Advances in Neural Information Processing Systems, 2020
Cited by: 186; Year: 2020
A sharp analysis of model-based reinforcement learning with self-play
Q Liu, T Yu, Y Bai, C Jin
International Conference on Machine Learning, 7001-7010, 2021
Cited by: 183; Year: 2021
Beyond linearization: On quadratic and higher-order approximation of wide neural networks
Y Bai, JD Lee
International Conference on Learning Representations (ICLR) 2020, 2019
Cited by: 154; Year: 2019
ProxQuant: Quantized neural networks via proximal operators
Y Bai, YX Wang, E Liberty
International Conference on Learning Representations (ICLR) 2019, 2018
Cited by: 151; Year: 2018
When can we learn general-sum Markov games with a large number of players sample-efficiently?
Z Song, S Mei, Y Bai
International Conference on Learning Representations (ICLR) 2022, 2021
Cited by: 137; Year: 2021
Provably Efficient Q-Learning with Low Switching Cost
Y Bai, T Xie, N Jiang, YX Wang
Advances in Neural Information Processing Systems, 2019
Cited by: 132; Year: 2019
The role of coverage in online reinforcement learning
T Xie, DJ Foster, Y Bai, N Jiang, SM Kakade
arXiv preprint arXiv:2210.04157, 2022
Cited by: 110; Year: 2022
Near-optimal provable uniform convergence in offline policy evaluation for reinforcement learning
M Yin, Y Bai, YX Wang
International Conference on Artificial Intelligence and Statistics, 1567-1575, 2021
Cited by: 109*; Year: 2021
How important is the train-validation split in meta-learning?
Y Bai, M Chen, P Zhou, T Zhao, J Lee, S Kakade, H Wang, C Xiong
International Conference on Machine Learning, 543-553, 2021
Cited by: 103; Year: 2021
Improved online conformal prediction via strongly adaptive online learning
A Bhatnagar, H Wang, C Xiong, Y Bai
International Conference on Machine Learning, 2337-2363, 2023
Cited by: 100; Year: 2023
Approximability of discriminators implies diversity in GANs
Y Bai, T Ma, A Risteski
International Conference on Learning Representations (ICLR) 2019, 2018
Cited by: 97; Year: 2018
Sample-efficient learning of Stackelberg equilibria in general-sum games
Y Bai, C Jin, H Wang, C Xiong
Advances in Neural Information Processing Systems 34, 25799-25811, 2021
Cited by: 94; Year: 2021
How do transformers learn in-context beyond simple functions? A case study on learning with representations
T Guo, W Hu, S Mei, H Wang, C Xiong, S Savarese, Y Bai
arXiv preprint arXiv:2310.10616, 2023
Cited by: 88; Year: 2023