| Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context G Team, P Georgiev, VI Lei, R Burnell, L Bai, A Gulati, G Tanzer, ... arXiv preprint arXiv:2403.05530, 2024 | 3467 | 2024 |
| Gemini 2.5: Pushing the frontier with advanced reasoning, multimodality, long context, and next generation agentic capabilities G Comanici, E Bieber, M Schaekermann, I Pasupat, N Sachdeva, I Dhillon, ... arXiv preprint arXiv:2507.06261, 2025 | 1259 | 2025 |
| MusicLM: Generating Music From Text A Agostinelli, TI Denk, Z Borsos, J Engel, M Verzetti, A Caillon, Q Huang, ... arXiv preprint arXiv:2301.11325, 2023 | 1038 | 2023 |
| AudioLM: a language modeling approach to audio generation Z Borsos, R Marinier, D Vincent, E Kharitonov, O Pietquin, M Sharifi, ... IEEE/ACM Transactions on Audio, Speech, and Language Processing 31, 2523-2533, 2023 | 971 | 2023 |
| AudioPaLM: A large language model that can speak and listen PK Rubenstein, C Asawaroengchai, DD Nguyen, A Bapna, Z Borsos, ... arXiv preprint arXiv:2306.12925, 2023 | 357 | 2023 |
| Coresets via Bilevel Optimization for Continual Learning and Streaming Z Borsos, M Mutný, A Krause NeurIPS 2020 - Advances in Neural Information Processing Systems, 2020 | 352 | 2020 |
| Speak, read and prompt: High-fidelity text-to-speech with minimal supervision E Kharitonov, D Vincent, Z Borsos, R Marinier, S Girgin, O Pietquin, ... Transactions of the Association for Computational Linguistics 11, 1703-1718, 2023 | 274 | 2023 |
| SoundStorm: Efficient Parallel Audio Generation Z Borsos, M Sharifi, D Vincent, E Kharitonov, N Zeghidour, M Tagliasacchi arXiv preprint arXiv:2305.09636, 2023 | 166 | 2023 |
| SpeechPainter: Text-conditioned Speech Inpainting Z Borsos, M Sharifi, M Tagliasacchi arXiv preprint arXiv:2202.07273, 2022 | 53 | 2022 |
| MusicRL: Aligning Music Generation to Human Preferences G Cideron, S Girgin, M Verzetti, D Vincent, M Kastelic, Z Borsos, ... arXiv preprint arXiv:2402.04229, 2024 | 45 | 2024 |
| LMCodec: A Low Bitrate Speech Codec with Causal Transformer Models T Jenrungrot, M Chinen, WB Kleijn, J Skoglund, Z Borsos, N Zeghidour, ... ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 41 | 2023 |
| Dealing with overlap and imbalance: a new metric and approach Z Borsos, C Lemnaru, R Potolea Pattern Analysis and Applications, 1-15, 2016 | 37 | 2016 |
| Online Variance Reduction for Stochastic Optimization Z Borsos, A Krause, KY Levy Proceedings of the 31st Conference on Learning Theory 75, 324-357, 2018 | 31 | 2018 |
| Semi-supervised Batch Active Learning via Bilevel Optimization Z Borsos, M Tagliasacchi, A Krause ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 28 | 2021 |
| TokenSplit: Using Discrete Speech Representations for Direct, Refined, and Transcript-Conditioned Speech Separation and Recognition H Erdogan, S Wisdom, X Chang, Z Borsos, M Tagliasacchi, N Zeghidour, ... arXiv preprint arXiv:2308.10415, 2023 | 20 | 2023 |
| Disentangling speech from surroundings with neural embeddings A Omran, N Zeghidour, Z Borsos, F de Chaumont Quitry, M Slaney, ... ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 19 | 2023 |
| Online Variance Reduction with Mixtures Z Borsos, S Curi, KY Levy, A Krause ICML 2019 - Proceedings of the 36th International Conference on Machine …, 2019 | 19 | 2019 |
| Data summarization via bilevel optimization Z Borsos, M Mutný, M Tagliasacchi, A Krause Journal of Machine Learning Research 25 (73), 1-53, 2024 | 15 | 2024 |
| Implementing Modular FFTs in FPGAs--A Basic Block for Lattice-Based Cryptography T Györfi, O Cret, Z Borsos Digital System Design (DSD), 2013 Euromicro Conference on, 305-308, 2013 | 11 | 2013 |
| Inference of the three-dimensional chromatin structure and its temporal behavior BC Cristescu, Z Borsos, J Lygeros, MR Martínez, MA Rapsomaniki arXiv preprint arXiv:1811.09619, 2018 | 9 | 2018 |