Michael Noukhovitch
Title
Cited by
Year
Systematic generalization: what is required and can it be learned?
D Bahdanau, S Murty, M Noukhovitch, TH Nguyen, H de Vries, A Courville
arXiv preprint arXiv:1811.12889, 2018
Cited by 278, 2018
Pretraining representations for data-efficient reinforcement learning
M Schwarzer, N Rajkumar, M Noukhovitch, A Anand, L Charlin, RD Hjelm, ...
Advances in Neural Information Processing Systems 34, 12686-12699, 2021
Cited by 193, 2021
The N+ implementation details of RLHF with PPO: A case study on TL;DR summarization
S Huang, M Noukhovitch, A Hosseini, K Rasul, W Wang, L Tunstall
arXiv preprint arXiv:2403.17031, 2024
Cited by 60, 2024
Emergent communication under competition
M Noukhovitch, T LaCroix, A Lazaridou, A Courville
arXiv preprint arXiv:2101.10276, 2021
Cited by 45, 2021
Asynchronous RLHF: Faster and more efficient off-policy RL for language models
M Noukhovitch, S Huang, S Xhonneux, A Hosseini, R Agarwal, ...
arXiv preprint arXiv:2410.18252, 2024
Cited by 41*, 2024
Language model alignment with elastic reset
M Noukhovitch, S Lavoie, F Strub, AC Courville
Advances in Neural Information Processing Systems 36, 3439-3461, 2023
Cited by 34, 2023
Commonsense mining as knowledge base completion? a study on the impact of novelty
S Jastrzębski, D Bahdanau, S Hosseini, M Noukhovitch, Y Bengio, ...
Proceedings of the Workshop on Generalization in the Age of Deep Learning, 8-16, 2018
Cited by 32, 2018
Simplicial embeddings in self-supervised learning and downstream classification
S Lavoie, C Tsirigotis, M Schwarzer, A Vani, M Noukhovitch, K Kawaguchi, ...
arXiv preprint arXiv:2204.00616, 2022
Cited by 31, 2022
Learning multi-agent communication with contrastive learning
YL Lo, B Sengupta, J Foerster, M Noukhovitch
arXiv preprint arXiv:2307.01403, 2023
Cited by 19, 2023
Oríon: Experiment version control for efficient hyperparameter optimization
C Tsirigotis, X Bouthillier, F Corneau-Tremblay, P Henderson, R Askari, ...
Cited by 14*, 2018
Epistimio/orion: Asynchronous distributed hyperparameter optimization
X Bouthillier, C Tsirigotis, F Corneau-Tremblay, T Schweizer, L Dong, ...
Nov, 2021
Cited by 8, 2021
Olmo 3
T Olmo, A Ettinger, A Bertsch, B Kuehl, D Graham, D Heineman, ...
arXiv preprint arXiv:2512.13961, 2025
Cited by 1, 2025
Learning Robust Social Strategies with Large Language Models
D Piche, M Muqeeth, M Aghajohari, J Duque, M Noukhovitch, A Courville
arXiv preprint arXiv:2511.19405, 2025
2025
Compositional Discrete Latent Code for High Fidelity, Productive Diffusion Models
S Lavoie, M Noukhovitch, A Courville
arXiv preprint arXiv:2507.12318, 2025
2025
Emergence of Communication with Selfish Agents
M Noukhovitch, T LaCroix, A Lazaridou, A Courville
Evolang, 314, 2020
2020
In-Context Learning, Can It Break Safety?
S Xhonneux, D Dobre, M Noukhovitch, J Tang, G Gidel, D Sridhar
ICML 2024 Next Generation of AI Safety Workshop, 0
Countering Language Drift with KL Regularization
M Noukhovitch, S Lavoie, IH Laradji, D Kiela, F Strub, A Courville
Selfish Emergent Communication
M Noukhovitch, T LaCroix, A Courville