Yash Chandak

Cited by

	All	Since 2021
Citations	1091	1012
h-index	12	12
i10-index	18	17

Citations per year

Year	Citations
2019	21
2020	54
2021	95
2022	139
2023	172
2024	283
2025	318
2026	5

Public access

12 articles available
0 articles not available

Based on funding mandates

Co-authors

Philip S. Thomas - University of Massachusetts - Verified email at cs.umass.edu
Georgios Theocharous - Adobe Research - Verified email at adobe.com
Emma Brunskill - Associate Professor of Computer Science, Stanford University - Verified email at cs.stanford.edu
James Kostas - PhD Student, University of Massachusetts Amherst - Verified email at umass.edu
Martha White - University of Alberta - Verified email at ualberta.ca
Scott Niekum - Associate Professor, University of Massachusetts Amherst - Verified email at cs.umass.edu
Sridhar Mahadevan - Director, Adobe Research & Professor, University of Massachusetts, Amherst - Verified email at cs.umass.edu
Bruno Castro da Silva - University of Massachusetts - Verified email at cs.umass.edu
Rémi Munos - FAIR, Meta - Verified email at inria.fr
Will Dabney - DeepMind - Verified email at google.com
Chris Nota - University of Massachusetts, Amherst - Verified email at cs.umass.edu
Erik Learned-Miller - Professor of Computer Science, University of Massachusetts Amherst - Verified email at cs.umass.edu
Balaraman Ravindran - Professor of Data Science and AI, Wadhwani School of Data Science and AI, IIT Madras - Verified email at dsai.iitm.ac.in
Shiv Shankar - UMass

Yash Chandak

Postdoctoral Scholar, Stanford University

Verified email at stanford.edu

Reinforcement Learning, Machine Learning


Title	Cited by	Year
Learning action representations for reinforcement learning. Y Chandak, G Theocharous, J Kostas, S Jordan, P Thomas. International conference on machine learning, 941-950, 2019	254	2019
Supervised pretraining can learn in-context reinforcement learning. J Lee, A Xie, A Pacchiano, Y Chandak, C Finn, O Nachum, E Brunskill. Advances in Neural Information Processing Systems 36, 43057-43083, 2023	145	2023
Evaluating the performance of reinforcement learning algorithms. S Jordan, Y Chandak, D Cohen, M Zhang, P Thomas. International Conference on Machine Learning, 4962-4973, 2020	135	2020
Optimizing for the future in non-stationary mdps. Y Chandak, G Theocharous, S Shankar, M White, S Mahadevan, ... International Conference on Machine Learning, 1414-1425, 2020	85	2020
Universal off-policy evaluation. Y Chandak, S Niekum, B da Silva, E Learned-Miller, E Brunskill, ... Advances in Neural Information Processing Systems 34, 27475-27490, 2021	72	2021
Understanding self-predictive learning for reinforcement learning. Y Tang, ZD Guo, PH Richemond, BA Pires, Y Chandak, R Munos, ... International Conference on Machine Learning, 33632-33656, 2023	49	2023
Lifelong learning with a changing action set. Y Chandak, G Theocharous, C Nota, P Thomas. Proceedings of the AAAI Conference on Artificial Intelligence 34 (04), 3373-3380, 2020	42	2020
The GPT surprise: offering large language model chat in a massive coding class reduced engagement but increased adopters exam performances. A Nie, Y Chandak, M Suzara, A Malik, J Woodrow, M Peng, M Sahami, ... arXiv preprint arXiv:2407.09975, 2024	41	2024
Towards safe policy improvement for non-stationary MDPs. Y Chandak, S Jordan, G Theocharous, M White, PS Thomas. Advances in Neural Information Processing Systems 33, 9156-9168, 2020	39	2020
Behavior alignment via reward function optimization. D Gupta, Y Chandak, S Jordan, PS Thomas, B C da Silva. Advances in Neural Information Processing Systems 36, 52759-52791, 2023	30	2023
Command a: An enterprise-ready large language model. T Cohere, A Ahmadian, M Ahmed, J Alammar, M Alizadeh, Y Alnumay, ... arXiv preprint arXiv:2504.00698, 2025	26	2025
Reinforcement learning for strategic recommendations. G Theocharous, Y Chandak, PS Thomas, F de Nijs. arXiv preprint arXiv:2009.07346, 2020	13	2020
Contrastive policy gradient: Aligning LLMs on sequence-level scores in a supervised-friendly fashion. Y Flet-Berliac, N Grinsztajn, F Strub, E Choi, B Wu, C Cremer, ... Proceedings of the 2024 Conference on Empirical Methods in Natural Language …, 2024	12	2024
Reinforcement learning when all actions are not always available. Y Chandak, G Theocharous, B Metevier, P Thomas. Proceedings of the AAAI Conference on Artificial Intelligence 34 (04), 3381-3388, 2020	12	2020
Factored DRO: Factored distributionally robust policies for contextual bandits. T Mu, Y Chandak, TB Hashimoto, E Brunskill. Advances in Neural Information Processing Systems 35, 8318-8331, 2022	11	2022
High-confidence off-policy (or counterfactual) variance estimation. Y Chandak, S Shankar, PS Thomas. Proceedings of the AAAI Conference on Artificial Intelligence 35 (8), 6939-6947, 2021	11	2021
Adaptive instrument design for indirect experiments. Y Chandak, S Shankar, V Syrgkanis, E Brunskill. arXiv preprint arXiv:2312.02438, 2023	10	2023
Fusion graph convolutional networks. P Vijayan, Y Chandak, MM Khapra, S Parthasarathy, B Ravindran. arXiv preprint arXiv:1805.12528, 2018	10	2018
Representations and exploration for deep reinforcement learning using singular value decomposition. Y Chandak, S Thakoor, ZD Guo, Y Tang, R Munos, W Dabney, DL Borsa. International Conference on Machine Learning, 4009-4034, 2023	9	2023
Off-policy evaluation for action-dependent non-stationary environments. Y Chandak, S Shankar, N Bastian, B da Silva, E Brunskill, PS Thomas. Advances in Neural Information Processing Systems 35, 9217-9232, 2022	9	2022
