| RoBERTa: A robustly optimized BERT pretraining approach Y Liu, M Ott, N Goyal, J Du, M Joshi, D Chen, O Levy, M Lewis, ... arXiv preprint arXiv:1907.11692, 2019 | 39945* | 2019 |
| LLaMA: Open and efficient foundation language models H Touvron, T Lavril, G Izacard, X Martinet, MA Lachaux, T Lacroix, ... arXiv preprint arXiv:2302.13971, 2023 | 23165 | 2023 |
| Llama 2: Open foundation and fine-tuned chat models H Touvron, L Martin, K Stone, P Albert, A Almahairi, Y Babaei, ... arXiv preprint arXiv:2307.09288, 2023 | 20956 | 2023 |
| BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension M Lewis, Y Liu, N Goyal, M Ghazvininejad, A Mohamed, O Levy, ... Proceedings of ACL 2020, 2019 | 15288 | 2019 |
| Retrieval-augmented generation for knowledge-intensive NLP tasks P Lewis, E Perez, A Piktus, F Petroni, V Karpukhin, N Goyal, H Küttler, ... Advances in Neural Information Processing Systems 33 (NeurIPS 2020), 2020 | 14954 | 2020 |
| The Llama 3 herd of models A Dubey, A Jauhri, A Pandey, A Kadian, A Al-Dahle, A Letman, A Mathur, ... arXiv preprint arXiv:2407.21783, 2024 | 12293* | 2024 |
| Unsupervised cross-lingual representation learning at scale A Conneau, K Khandelwal, N Goyal, V Chaudhary, G Wenzek, F Guzmán, ... Proceedings of ACL 2020, 2019 | 9188 | 2019 |
| OPT: Open pre-trained transformer language models S Zhang, S Roller, N Goyal, M Artetxe, M Chen, S Chen, C Dewan, ... arXiv preprint arXiv:2205.01068, 2022 | 5242* | 2022 |
| Multilingual denoising pre-training for neural machine translation Y Liu, J Gu, N Goyal, X Li, S Edunov, M Ghazvininejad, M Lewis, ... Transactions of the Association for Computational Linguistics 8, 726-742, 2020 | 2409 | 2020 |
| Recipes for building an open-domain chatbot S Roller, E Dinan, N Goyal, D Ju, M Williamson, Y Liu, J Xu, M Ott, ... EACL 2021, 2020 | 1345 | 2020 |
| Gemini 2.5: Pushing the frontier with advanced reasoning, multimodality, long context, and next generation agentic capabilities G Comanici, E Bieber, M Schaekermann, I Pasupat, N Sachdeva, I Dhillon, ... arXiv preprint arXiv:2507.06261, 2025 | 1337 | 2025 |
| Beyond english-centric multilingual machine translation A Fan, S Bhosale, H Schwenk, Z Ma, A El-Kishky, S Goyal, M Baines, ... Journal of Machine Learning Research 22 (107), 1-48, 2021 | 1105 | 2021 |
| XLS-R: Self-supervised cross-lingual speech representation learning at scale A Babu, C Wang, A Tjandra, K Lakhotia, Q Xu, N Goyal, K Singh, ... Interspeech 2022, 2021 | 1059 | 2021 |
| Multilingual translation with extensible multilingual pretraining and finetuning Y Tang, C Tran, X Li, PJ Chen, N Goyal, V Chaudhary, J Gu, A Fan Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, 2020 | 740* | 2020 |
| The FLORES-101 evaluation benchmark for low-resource and multilingual machine translation N Goyal, C Gao, V Chaudhary, PJ Chen, G Wenzek, D Ju, S Krishnan, ... Transactions of the Association for Computational Linguistics 10, 522-538, 2022 | 666 | 2022 |
| BlenderBot 3: A deployed conversational agent that continually learns to responsibly engage K Shuster, J Xu, M Komeili, D Ju, EM Smith, S Roller, M Ung, M Chen, ... arXiv preprint arXiv:2208.03188, 2022 | 376 | 2022 |
| BASE Layers: Simplifying training of large, sparse models M Lewis, S Bhosale, T Dettmers, N Goyal, L Zettlemoyer International Conference on Machine Learning, 6265-6274, 2021 | 367 | 2021 |
| Few-shot learning with multilingual language models XV Lin, T Mihaylov, M Artetxe, T Wang, S Chen, D Simig, M Ott, N Goyal, ... arXiv preprint arXiv:2112.10668, 2021 | 348* | 2021 |
| Better fine-tuning by reducing representational collapse A Aghajanyan, A Shrivastava, A Gupta, N Goyal, L Zettlemoyer, S Gupta ICLR 2021, 2020 | 292 | 2020 |