| Systematic generalization: what is required and can it be learned? D Bahdanau, S Murty, M Noukhovitch, TH Nguyen, H de Vries, A Courville arXiv preprint arXiv:1811.12889, 2018 | 278 | 2018 |
| Pretraining representations for data-efficient reinforcement learning M Schwarzer, N Rajkumar, M Noukhovitch, A Anand, L Charlin, RD Hjelm, ... Advances in Neural Information Processing Systems 34, 12686-12699, 2021 | 193 | 2021 |
| The N+ implementation details of RLHF with PPO: A case study on TL;DR summarization S Huang, M Noukhovitch, A Hosseini, K Rasul, W Wang, L Tunstall arXiv preprint arXiv:2403.17031, 2024 | 60 | 2024 |
| Emergent communication under competition M Noukhovitch, T LaCroix, A Lazaridou, A Courville arXiv preprint arXiv:2101.10276, 2021 | 45 | 2021 |
| Asynchronous RLHF: Faster and more efficient off-policy RL for language models M Noukhovitch, S Huang, S Xhonneux, A Hosseini, R Agarwal, ... arXiv preprint arXiv:2410.18252, 2024 | 41* | 2024 |
| Language model alignment with elastic reset M Noukhovitch, S Lavoie, F Strub, AC Courville Advances in Neural Information Processing Systems 36, 3439-3461, 2023 | 34 | 2023 |
| Commonsense mining as knowledge base completion? A study on the impact of novelty S Jastrzębski, D Bahdanau, S Hosseini, M Noukhovitch, Y Bengio, ... Proceedings of the Workshop on Generalization in the Age of Deep Learning, 8-16, 2018 | 32 | 2018 |
| Simplicial embeddings in self-supervised learning and downstream classification S Lavoie, C Tsirigotis, M Schwarzer, A Vani, M Noukhovitch, K Kawaguchi, ... arXiv preprint arXiv:2204.00616, 2022 | 31 | 2022 |
| Learning multi-agent communication with contrastive learning YL Lo, B Sengupta, J Foerster, M Noukhovitch arXiv preprint arXiv:2307.01403, 2023 | 19 | 2023 |
| Oríon: Experiment version control for efficient hyperparameter optimization C Tsirigotis, X Bouthillier, F Corneau-Tremblay, P Henderson, R Askari, ... | 14* | 2018 |
| Epistimio/orion: Asynchronous distributed hyperparameter optimization X Bouthillier, C Tsirigotis, F Corneau-Tremblay, T Schweizer, L Dong, ... Nov, 2021 | 8 | 2021 |
| Olmo 3 T Olmo, A Ettinger, A Bertsch, B Kuehl, D Graham, D Heineman, ... arXiv preprint arXiv:2512.13961, 2025 | 1 | 2025 |
| Learning Robust Social Strategies with Large Language Models D Piche, M Muqeeth, M Aghajohari, J Duque, M Noukhovitch, A Courville arXiv preprint arXiv:2511.19405, 2025 | | 2025 |
| Compositional Discrete Latent Code for High Fidelity, Productive Diffusion Models S Lavoie, M Noukhovitch, A Courville arXiv preprint arXiv:2507.12318, 2025 | | 2025 |
| Emergence of Communication with Selfish Agents M Noukhovitch, T LaCroix, A Lazaridou, A Courville Evolang, 314, 2020 | | 2020 |
| In-Context Learning, Can It Break Safety? S Xhonneux, D Dobre, M Noukhovitch, J Tang, G Gidel, D Sridhar ICML 2024 Next Generation of AI Safety Workshop | | |
| Countering Language Drift with KL Regularization M Noukhovitch, S Lavoie, IH Laradji, D Kiela, F Strub, A Courville | | |
| Selfish Emergent Communication M Noukhovitch, T LaCroix, A Courville | | |