Thomas Mesnard

Cited by

	All	Since 2021
Citations	8474	8055
h-index	19	19
i10-index	21	21

Citations per year
Year	Citations
2016	33
2017	71
2018	74
2019	89
2020	122
2021	140
2022	145
2023	231
2024	1969
2025	5340
2026	215

Public access

2 articles available
0 articles not available

Based on funding mandates

Co-authors

Yoshua Bengio	Professor of computer science, University of Montreal, Mila, IVADO, CIFAR	Verified email at umontreal.ca
Rémi Munos	FAIR, Meta	Verified email at inria.fr
Bilal Piot	Google DeepMind	Verified email at google.com
Will Dabney	DeepMind	Verified email at google.com
Laurent Sifre	H Company	Verified email at polytechnique.edu
Doina Precup	DeepMind and McGill University	Verified email at cs.mcgill.ca
Theophane Weber	Research Scientist at DeepMind	Verified email at google.com
Eric Moulines	Académie des Sciences, MBZUAI, EPITA	Verified email at mbzuai.ac.ae
Armand Joulin	Google DeepMind	Verified email at google.com
Demis Hassabis	DeepMind
Jeff Dean	Google Chief Scientist, Google Research and Google DeepMind	Verified email at google.com
Koray Kavukcuoglu	DeepMind	Verified email at kavukcuoglu.org
Clement Farabet	Ex Research Scientist, New York University	Verified email at nyu.edu
Oriol Vinyals	Research Scientist at Google DeepMind	Verified email at google.com
Noah Fiedel	Google	Verified email at engineeralum.berkeley.edu

Thomas Mesnard

Research Scientist @ Meta Superintelligence Labs

Verified email at meta.com

LLM, Reinforcement Learning, Artificial Intelligence


Title	Authors	Venue	Cited by	Year
Gemma: Open models based on Gemini research and technology	G Team, T Mesnard, C Hardin, R Dadashi, S Bhupatiraju, S Pathak, ...	arXiv preprint arXiv:2403.08295, 2024	2449	2024
Gemma 2: Improving open language models at a practical size	G Team, M Riviere, S Pathak, PG Sessa, C Hardin, S Bhupatiraju, ...	arXiv preprint arXiv:2408.00118, 2024	1930*	2024
RLAIF: Scaling reinforcement learning from human feedback with AI feedback	H Lee, S Phatale, H Mansoor, KR Lu, T Mesnard, J Ferret, C Bishop, ...		1056*	2023
Gemma 3 technical report	G Team, A Kamath, J Ferret, S Pathak, N Vieillard, R Merhej, S Perrin, ...	arXiv preprint arXiv:2503.19786, 2025	952	2025
Towards biologically plausible deep learning	Y Bengio, DH Lee, J Bornschein, T Mesnard, Z Lin	arXiv preprint arXiv:1502.04156, 2015	524	2015
Direct language model alignment from online AI feedback	S Guo, B Zhang, T Liu, T Liu, M Khalman, F Llinares, A Rame, T Mesnard, ...	arXiv preprint arXiv:2402.04792, 2024	221	2024
Nash learning from human feedback	R Munos, M Valko, D Calandriello, MG Azar, M Rowland, ZD Guo, Y Tang, ...	Forty-first International Conference on Machine Learning, 2024	218	2024
An objective function for STDP	Y Bengio, T Mesnard, A Fischer, S Zhang, Y Wu	arXiv preprint arXiv:1509.05936 5 (6.2), 6.3, 2015	208*	2015
MedGemma technical report	A Sellergren, S Kazemzadeh, T Jaroensri, A Kiraly, M Traverse, ...	arXiv preprint arXiv:2507.05201, 2025	175	2025
Hindsight credit assignment	A Harutyunyan, W Dabney, T Mesnard, M Gheshlaghi Azar, B Piot, ...	Advances in neural information processing systems 32, 2019	122	2019
PaliGemma 2: A family of versatile VLMs for transfer	A Steiner, AS Pinto, M Tschannen, D Keysers, X Wang, Y Bitton, ...	arXiv preprint arXiv:2412.03555, 2024	115	2024
Counterfactual credit assignment in model-free reinforcement learning	T Mesnard, T Weber, F Viola, S Thakoor, A Saade, A Harutyunyan, ...	arXiv preprint arXiv:2011.09464, 2020	94	2020
Gemma 2: Improving open language models at a practical size, 2024	M Riviere, S Pathak, PG Sessa, C Hardin, S Bhupatiraju, L Hussenot, ...	URL https://arxiv.org/abs/2408.00118, 0	78	
others. 2024. Gemma: Open models based on Gemini research and technology	G Team, T Mesnard, C Hardin, R Dadashi, S Bhupatiraju, S Pathak, ...	arXiv preprint arXiv:2403.08295, 1	76	1
Generalization of equilibrium propagation to vector field dynamics	B Scellier, A Goyal, J Binas, T Mesnard, Y Bengio	arXiv preprint arXiv:1808.04873, 2018	56*	2018
Geometric entropic exploration	ZD Guo, MG Azar, A Saade, S Thakoor, B Piot, BA Pires, M Valko, ...	arXiv preprint arXiv:2101.02055, 2021	50	2021
A survey of temporal credit assignment in deep reinforcement learning	E Pignatelli, J Ferret, M Geist, T Mesnard, H van Hasselt, O Pietquin, ...	arXiv preprint arXiv:2312.01072, 2023	46	2023
Towards deep learning with spiking neurons in energy based models with contrastive Hebbian plasticity	T Mesnard, W Gerstner, J Brea	arXiv preprint arXiv:1612.03214, 2016	27	2016
Curiosity in hindsight: Intrinsic exploration in stochastic environments	D Jarrett, C Tallec, F Altché, T Mesnard, R Munos, M Valko	arXiv preprint arXiv:2211.10515, 2022	26	2022
RecurrentGemma: Moving past transformers for efficient open language models	A Botev, S De, SL Smith, A Fernando, GC Muraru, R Haroun, L Berrada, ...	arXiv preprint arXiv:2404.07839, 2024	19	2024
