A. Rupam Mahmood

Cited by

	All	Since 2021
Citations	2509	1920
h-index	24	21
i10-index	38	35

Year	Citations
2013	7
2014	20
2015	34
2016	57
2017	58
2018	90
2019	130
2020	178
2021	237
2022	249
2023	288
2024	502
2025	636
2026	5

Public access (based on funding mandates)

11 articles available
0 articles not available

Co-authors

Richard S. Sutton - Keen, Amii, and University of Alberta - Verified email at richsutton.com
Gautham Vasan - University of Alberta, Amii - Verified email at ualberta.ca
Qingfeng Lan - Qwen Team, Alibaba Group - Verified email at ualberta.ca
Martha White - University of Alberta - Verified email at ualberta.ca
James Bergstra - Principal Engineer, Ocado Technology - Verified email at ocado.com
Dmytro Korenkevych - Meta AI - Verified email at meta.com
Shibhansh Dohare - PhD Candidate, University of Alberta - Verified email at ualberta.ca
Hado van Hasselt - Research Scientist, DeepMind; Honorary Professor, UCL - Verified email at google.com
Patrick M. Pilarski - Professor, University of Alberta, Amii (Alberta Machine Intelligence Institute) - Verified email at ualberta.ca
Doina Precup - DeepMind and McGill University - Verified email at cs.mcgill.ca
Harm van Seijen - Sony AI - Verified email at sony.com
Brent Komer - PhD Student, University of Waterloo - Verified email at uwaterloo.ca
Fengdi Che - University of Alberta - Verified email at ualberta.ca
Marlos C. Machado - University of Alberta | Amii | Canada CIFAR AI Chair - Verified email at ualberta.ca
Thomas Degris - DeepMind - Verified email at google.com
Bryan Chan - University of Alberta - Verified email at ualberta.ca
Oliver Limoyo - University of Toronto Institute for Aerospace Studies - Verified email at mail.utoronto.ca
Jonathan Kelly - Professor, Institute for Aerospace Studies, University of Toronto - Verified email at robotics.utias.utoronto.ca
Mohamed Elsayed - PhD student @ University of Alberta - Verified email at ualberta.ca

A. Rupam Mahmood

University of Alberta, Amii

Verified email at ualberta.ca

Continual learning, reinforcement learning, robot learning, representation learning


Title	Cited by	Year
An emphatic approach to the problem of off-policy temporal-difference learning. RS Sutton, AR Mahmood, M White. (JMLR) Journal of Machine Learning Research 17, 2016	366	2016
Benchmarking reinforcement learning algorithms on real-world robots. AR Mahmood, D Korenkevych, G Vasan, W Ma, J Bergstra. (CoRL) Proceedings of the 2nd Annual Conference on Robot Learning, 2018	268	2018
Loss of plasticity in deep continual learning. S Dohare, JF Hernandez-Garcia, Q Lan, P Rahman, AR Mahmood, ... Nature 632 (8026), 768-774, 2024	253	2024
Weighted importance sampling for off-policy learning with linear function approximation. AR Mahmood, H van Hasselt, RS Sutton. (NeurIPS) Advances in Neural Information Processing Systems 27, 2014	210	2014
True online temporal-difference learning. H van Seijen, AR Mahmood, PM Pilarski, MC Machado, RS Sutton. (JMLR) Journal of Machine Learning Research 17, 2016	135	2016
Setting up a reinforcement learning task with a real-world robot. AR Mahmood, D Korenkevych, BJ Komer, J Bergstra. (IROS) 2018 IEEE/RSJ International Conference on Intelligent Robots and …, 2018	120	2018
Continual backprop: Stochastic gradient descent with persistent randomness. S Dohare, RS Sutton, AR Mahmood. arXiv preprint arXiv:2108.06325, 2021	107	2021
Tuning-free step-size adaptation. AR Mahmood, RS Sutton, T Degris, PM Pilarski. (ICASSP) Acoustics, Speech and Signal Processing, 2012 IEEE International …, 2012	99	2012
Addressing Loss of Plasticity and Catastrophic Forgetting in Continual Learning. M Elsayed, AR Mahmood. (ICLR) The Twelfth International Conference on Learning Representations, 2024	64*	2024
Multi-step off-policy learning without importance sampling ratios. AR Mahmood, H Yu, RS Sutton. arXiv preprint arXiv:1702.03006, 2017	58	2017
Greedification operators for policy optimization: investigating forward and reverse KL divergences. A Chan, H Silva, S Lim, T Kozuno, AR Mahmood, M White. (JMLR) Journal of Machine Learning Research, 2022	53	2022
Representation Search through Generate and Test. AR Mahmood, RS Sutton. Workshops at the Twenty-Seventh AAAI Conference on Artificial Intelligence, 2013	52	2013
Off-policy TD(λ) with a true online equivalence. H van Hasselt, AR Mahmood, RS Sutton. (UAI) Proceedings of the 30th Conference on Uncertainty in Artificial …, 2014	49	2014
On generalized Bellman equations and temporal-difference learning. H Yu, AR Mahmood, RS Sutton. (JMLR) The Journal of Machine Learning Research 19 (1), 1864-1912, 2018	48	2018
A new Q(λ) with interim forward view and Monte Carlo equivalence. RS Sutton, AR Mahmood, D Precup, H van Hasselt. (ICML) International Conference on Machine Learning, 2014	45	2014
Maintaining plasticity in deep continual learning. S Dohare, JF Hernandez-Garcia, P Rahman, AR Mahmood, RS Sutton. arXiv preprint arXiv:2306.13812, 2023	43	2023
Emphatic temporal-difference learning. AR Mahmood, H Yu, M White, RS Sutton. European Workshops on Reinforcement Learning, 2015	41	2015
Provable and Practical: Efficient Exploration in Reinforcement Learning via Langevin Monte Carlo. H Ishfaq, Q Lan, P Xu, AR Mahmood, D Precup, A Anandkumar, ... (ICLR) International Conference on Learning Representations, 2024	37	2024
Autoregressive policies for continuous control deep reinforcement learning. D Korenkevych, AR Mahmood, G Vasan, J Bergstra. (IJCAI) Proceedings of the 28th International Joint Conference on Artificial …, 2019	35	2019
Off-policy learning based on weighted importance sampling with linear computational complexity. AR Mahmood, RS Sutton. (UAI) Proceedings of the 31st Conference on Uncertainty in Artificial …, 2015	33	2015
