| BLOOM: A 176B-parameter open-access multilingual language model BS Workshop, TL Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, ... arXiv preprint arXiv:2211.05100, 2022 | 2292 | 2022 |
| Gemini 2.5: Pushing the frontier with advanced reasoning, multimodality, long context, and next generation agentic capabilities G Comanici, E Bieber, M Schaekermann, I Pasupat, N Sachdeva, I Dhillon, ... arXiv preprint arXiv:2507.06261, 2025 | 1337 | 2025 |
| Between words and characters: A brief history of open-vocabulary modeling and tokenization in NLP SJ Mielke, Z Alyafeai, E Salesky, C Raffel, M Dey, M Gallé, A Raja, C Si, ... arXiv preprint arXiv:2112.10508, 2021 | 304* | 2021 |
| The Multilingual TEDx Corpus for Speech Recognition and Translation E Salesky, M Wiesner, J Bremerman, R Cattoni, M Negri, M Turchi, ... Interspeech 2021, 2021 | 178 | 2021 |
| Findings of the IWSLT 2020 Evaluation Campaign E Ansari, A Axelrod, N Bach, O Bojar, R Cattoni, F Dalvi, N Durrani, ... Proceedings of IWSLT 2020, 1-34, 2020 | 143 | 2020 |
| Findings of the IWSLT 2022 Evaluation Campaign A Antonios, B Loc, L Bentivogli, MZ Boito, B Ondřej, R Cattoni, C Anna, ... Proceedings of the 19th International Conference on Spoken Language …, 2022 | 125 | 2022 |
| The IWSLT 2019 Evaluation Campaign J Niehues, R Cattoni, S Stüker, M Negri, M Turchi, TL Ha, E Salesky, ... Proceedings of IWSLT 2019, 2019 | 107 | 2019 |
| Language modelling with pixels P Rust, JF Lotz, E Bugliarello, E Salesky, M de Lhoneux, D Elliott ICLR 2023, 2022 | 95* | 2022 |
| Generalized Entropy Regularization or: There's Nothing Special about Label Smoothing C Meister, E Salesky, R Cotterell ACL 2020, 2020 | 79 | 2020 |
| SIGMORPHON 2020 Shared Task 0: Typologically Diverse Morphological Inflection E Vylomova, J White, E Salesky, SJ Mielke, S Wu, E Ponti, RH Maudslay, ... SIGMORPHON 2020, 2020 | 77 | 2020 |
| Findings of the IWSLT 2023 Evaluation Campaign M Agarwal, S Agrawal, A Anastasopoulos, L Bentivogli, O Bojar, C Borg, ... Proceedings of the 20th International Conference on Spoken Language …, 2023 | 65 | 2023 |
| Robust open-vocabulary translation from visual text representations E Salesky, D Etter, M Post EMNLP 2021, 2021 | 59 | 2021 |
| Fluent Translations from Disfluent Speech in End-to-End Speech Translation E Salesky, M Sperber, A Waibel NAACL 2019, 2019 | 50 | 2019 |
| Optimizing segmentation granularity for neural machine translation E Salesky, A Runge, A Coda, J Niehues, G Neubig Machine Translation 34 (1), 41-59, 2020 | 49 | 2020 |
| Exploring Phoneme-Level Speech Representations for End-to-End Speech Translation E Salesky, M Sperber, AW Black ACL 2019, 2019 | 49* | 2019 |
| UniMorph 4.0: universal morphology K Batsuren, O Goldman, S Khalifa, N Habash, W Kieraś, G Bella, ... Proceedings of the Thirteenth Language Resources and Evaluation Conference …, 2022 | 48 | 2022 |
| Relative Positional Encoding for Speech Recognition and Direct Translation NQ Pham, TL Ha, TN Nguyen, TS Nguyen, E Salesky, S Stueker, ... INTERSPEECH 2020, 2020 | 48 | 2020 |
| A surprisal–duration trade-off across and within the world's languages T Pimentel, C Meister, E Salesky, S Teufel, D Blasi, R Cotterell EMNLP 2021, 2021 | 47 | 2021 |
| A Corpus for Large-Scale Phonetic Typology E Salesky, E Chodroff, T Pimentel, M Wiesner, R Cotterell, AW Black, ... ACL 2020, 2020 | 40 | 2020 |
| WMT24++: Expanding the language coverage of WMT24 to 55 languages & dialects D Deutsch, E Briakou, IR Caswell, M Finkelstein, R Galor, J Juraska, ... Findings of the Association for Computational Linguistics: ACL 2025, 12257-12284, 2025 | 39 | 2025 |