Sainbayar Sukhbaatar
FAIR team, Meta AI
Verified email at meta.com
Title
Cited by
Year
End-To-End Memory Networks
S Sukhbaatar, A Szlam, J Weston, R Fergus
Cited by: 3559; Year: 2015
Learning multiagent communication with backpropagation
S Sukhbaatar, A Szlam, R Fergus
Advances in Neural Information Processing Systems, 2244-2252, 2016
Cited by: 1686; Year: 2016
Training Convolutional Networks with Noisy Labels
S Sukhbaatar, J Bruna, M Paluri, L Bourdev, R Fergus
Accepted as a workshop contribution at ICLR 2015, 2014
Cited by: 1071*; Year: 2014
Self-rewarding language models
W Yuan, RY Pang, K Cho, X Li, S Sukhbaatar, J Xu, JE Weston
Forty-first International Conference on Machine Learning, 2024
Cited by: 630; Year: 2024
Intrinsic motivation and automatic curricula via asymmetric self-play
S Sukhbaatar, Z Lin, I Kostrikov, G Synnaeve, A Szlam, R Fergus
arXiv preprint arXiv:1703.05407, 2017
Cited by: 486; Year: 2017
Learning when to communicate at scale in multiagent cooperative and competitive tasks
A Singh, T Jain, S Sukhbaatar
arXiv preprint arXiv:1812.09755, 2018
Cited by: 459; Year: 2018
Simple baseline for visual question answering
B Zhou, Y Tian, S Sukhbaatar, A Szlam, R Fergus
arXiv preprint arXiv:1512.02167, 2015
Cited by: 446; Year: 2015
Adaptive attention span in transformers
S Sukhbaatar, E Grave, P Bojanowski, A Joulin
arXiv preprint arXiv:1905.07799, 2019
Cited by: 386; Year: 2019
Training large language models to reason in a continuous latent space
S Hao, S Sukhbaatar, DJ Su, X Li, Z Hu, J Weston, Y Tian
arXiv preprint arXiv:2412.06769, 2024
Cited by: 301; Year: 2024
Iterative reasoning preference optimization
RY Pang, W Yuan, H He, K Cho, S Sukhbaatar, J Weston
Advances in Neural Information Processing Systems 37, 116617-116637, 2024
Cited by: 280; Year: 2024
Hash layers for large sparse models
S Roller, S Sukhbaatar, J Weston
Advances in Neural Information Processing Systems 34, 17555-17566, 2021
Cited by: 264; Year: 2021
Augmenting self-attention with persistent memory
S Sukhbaatar, E Grave, G Lample, H Jegou, A Joulin
arXiv preprint arXiv:1907.01470, 2019
Cited by: 156; Year: 2019
Meta-rewarding language models: Self-improving alignment with llm-as-a-meta-judge
T Wu, W Yuan, O Golovneva, J Xu, Y Tian, J Jiao, JE Weston, ...
Proceedings of the 2025 Conference on Empirical Methods in Natural Language …, 2025
Cited by: 153; Year: 2025
Teaching large language models to reason with reinforcement learning
A Havrilla, Y Du, SC Raparthy, C Nalmpantis, J Dwivedi-Yu, ...
arXiv preprint arXiv:2403.04642, 2024
Cited by: 151; Year: 2024
Some things are more cringe than others: Preference optimization with the pairwise cringe loss
J Xu, A Lee, S Sukhbaatar, J Weston
CoRR, 2023
Cited by: 117; Year: 2023
Memory-augmented reinforcement learning for image-goal navigation
L Mezghan, S Sukhbaatar, T Lavril, O Maksymets, D Batra, P Bojanowski, ...
2022 IEEE/RSJ International Conference on Intelligent Robots and Systems …, 2022
Cited by: 115; Year: 2022
System 2 attention (is something you might need too)
J Weston, S Sukhbaatar
arXiv preprint arXiv:2311.11829, 2023
Cited by: 98; Year: 2023
Composable planning with attributes
A Zhang, S Sukhbaatar, A Lerer, A Szlam, R Fergus
International Conference on Machine Learning, 5842-5851, 2018
Cited by: 97; Year: 2018
Mazebase: A sandbox for learning from games
S Sukhbaatar, A Szlam, G Synnaeve, S Chintala, R Fergus
arXiv preprint arXiv:1511.07401, 2015
Cited by: 91; Year: 2015
Addressing Some Limitations of Transformers with Feedback Memory
A Fan, T Lavril, E Grave, A Joulin, S Sukhbaatar
arXiv preprint arXiv:2002.09402, 2020
Cited by: 85*; Year: 2020