Qinqing Zheng
Inception
Verified email at inceptionlabs.ai
Title
Cited by
Year
Online decision transformer
Q Zheng, A Zhang, A Grover
International Conference on Machine Learning 162, 27042--27059, 2022
Cited by: 348 | Year: 2022
The llama 4 herd: The beginning of a new era of natively multimodal ai innovation
AI Meta
https://ai.meta.com/blog/llama-4-multimodal-intelligence/, checked on 4 (7 …, 2025
Cited by: 249 | Year: 2025
A convergent gradient descent algorithm for rank minimization and semidefinite programming from random linear measurements
Q Zheng, J Lafferty
Advances in Neural Information Processing Systems (NeurIPS), 109--117, 2015
Cited by: 238 | Year: 2015
Convergence analysis for rectangular matrix completion using Burer-Monteiro factorization and gradient descent
Q Zheng, J Lafferty
arXiv preprint arXiv:1605.07051, 2016
Cited by: 208 | Year: 2016
Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping
L Lehnert, S Sukhbaatar, DJ Su, Q Zheng, P Mcvay, M Rabbat, Y Tian
COLM 2024, 2024
Cited by: 93* | Year: 2024
Federated f-differential privacy
Q Zheng, S Chen, Q Long, W Su
AISTATS 2021, 2021
Cited by: 90 | Year: 2021
Guided flows for generative modeling and decision making
Q Zheng, M Le, N Shaul, Y Lipman, A Grover, RTQ Chen
arXiv preprint arXiv:2311.13443, 2023
Cited by: 74 | Year: 2023
d1: Scaling reasoning in diffusion large language models via reinforcement learning
S Zhao, D Gupta, Q Zheng, A Grover
NeurIPS 2025, 2025
Cited by: 64 | Year: 2025
Dual RL: Unification and New Methods for Reinforcement and Imitation Learning
H Sikchi, Q Zheng, A Zhang, S Niekum
ICLR 2024, 2023
Cited by: 63* | Year: 2023
Diffusion world model: Future Modeling Beyond Step-by-Step Rollout for Offline Reinforcement Learning
Z Ding, A Zhang, Y Tian, Q Zheng
arXiv preprint arXiv:2402.03570, 2024
Cited by: 55* | Year: 2024
Dualformer: Controllable fast and slow thinking by learning with randomized reasoning traces
DJ Su, S Sukhbaatar, M Rabbat, Y Tian, Q Zheng
ICLR 2025, 2024
Cited by: 44 | Year: 2024
Minimax Estimation for Personalized Federated Learning: An Alternative between FedAvg and Local Training?
S Chen, Q Zheng, Q Long, WJ Su
JMLR, 2023
Cited by: 43* | Year: 2023
Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning
DJ Su, H Zhu, Y Xu, J Jiao, Y Tian, Q Zheng
ICML 2025, 2025
Cited by: 40 | Year: 2025
Semi-supervised offline reinforcement learning with action-free trajectories
Q Zheng, M Henaff, B Amos, A Grover
ICML 2023, 2023
Cited by: 32 | Year: 2023
Sharp Composition Bounds for Gaussian Differential Privacy via Edgeworth Expansion
Q Zheng, J Dong, Q Long, WJ Su
ICML 2020, 2020
Cited by: 27 | Year: 2020
Interpolating convex and non-convex tensor decompositions via the subspace norm
Q Zheng, R Tomioka
NeurIPS 2015, 2015
Cited by: 18 | Year: 2015
Latent state marginalization as a low-cost approach for improving exploration
D Zhang, A Courville, Y Bengio, Q Zheng, A Zhang, RTQ Chen
ICLR 2023, 2022
Cited by: 14 | Year: 2022
Reliable conditioning of behavioral cloning for offline reinforcement learning
T Nguyen, Q Zheng, A Grover
arXiv preprint arXiv:2210.05158, 2022
Cited by: 11* | Year: 2022
Near-Optimal Confidence Sequences for Bounded Random Variables
AK Kuchibhotla, Q Zheng
ICML 2021, 2021
Cited by: 11 | Year: 2021
Online Intrinsic Rewards for Decision Making Agents from Large Language Model Feedback
Q Zheng, M Henaff, A Zhang, A Grover, B Amos
RLC 2025, 2024
Cited by: 10 | Year: 2024