Zalán Borsos
Microsoft AI
Verified email at microsoft.com
Title
Cited by
Year
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
G Team, P Georgiev, VI Lei, R Burnell, L Bai, A Gulati, G Tanzer, ...
arXiv preprint arXiv:2403.05530, 2024
Cited by: 3467, Year: 2024
Gemini 2.5: Pushing the frontier with advanced reasoning, multimodality, long context, and next generation agentic capabilities
G Comanici, E Bieber, M Schaekermann, I Pasupat, N Sachdeva, I Dhillon, ...
arXiv preprint arXiv:2507.06261, 2025
Cited by: 1259, Year: 2025
MusicLM: Generating Music From Text
A Agostinelli, TI Denk, Z Borsos, J Engel, M Verzetti, A Caillon, Q Huang, ...
arXiv preprint arXiv:2301.11325, 2023
Cited by: 1038, Year: 2023
AudioLM: a Language Modeling Approach to Audio Generation
Z Borsos, R Marinier, D Vincent, E Kharitonov, O Pietquin, M Sharifi, ...
IEEE/ACM Transactions on Audio, Speech, and Language Processing 31, 2523-2533, 2023
Cited by: 971, Year: 2023
AudioPaLM: A Large Language Model That Can Speak and Listen
PK Rubenstein, C Asawaroengchai, DD Nguyen, A Bapna, Z Borsos, ...
arXiv preprint arXiv:2306.12925, 2023
Cited by: 357, Year: 2023
Coresets via Bilevel Optimization for Continual Learning and Streaming
Z Borsos, M Mutný, A Krause
NeurIPS 2020 - Advances in Neural Information Processing Systems, 2020
Cited by: 352, Year: 2020
Speak, read and prompt: High-fidelity text-to-speech with minimal supervision
E Kharitonov, D Vincent, Z Borsos, R Marinier, S Girgin, O Pietquin, ...
Transactions of the Association for Computational Linguistics 11, 1703-1718, 2023
Cited by: 274, Year: 2023
SoundStorm: Efficient Parallel Audio Generation
Z Borsos, M Sharifi, D Vincent, E Kharitonov, N Zeghidour, M Tagliasacchi
arXiv preprint arXiv:2305.09636, 2023
Cited by: 166, Year: 2023
SpeechPainter: Text-conditioned Speech Inpainting
Z Borsos, M Sharifi, M Tagliasacchi
arXiv preprint arXiv:2202.07273, 2022
Cited by: 53, Year: 2022
MusicRL: Aligning Music Generation to Human Preferences
G Cideron, S Girgin, M Verzetti, D Vincent, M Kastelic, Z Borsos, ...
arXiv preprint arXiv:2402.04229, 2024
Cited by: 45, Year: 2024
LMCodec: A Low Bitrate Speech Codec with Causal Transformer Models
T Jenrungrot, M Chinen, WB Kleijn, J Skoglund, Z Borsos, N Zeghidour, ...
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 41, Year: 2023
Dealing with overlap and imbalance: a new metric and approach
Z Borsos, C Lemnaru, R Potolea
Pattern Analysis and Applications, 1-15, 2016
Cited by: 37, Year: 2016
Online Variance Reduction for Stochastic Optimization
Z Borsos, A Krause, KY Levy
Proceedings of the 31st Conference On Learning Theory 75, 324-357, 2018
Cited by: 31, Year: 2018
Semi-supervised Batch Active Learning via Bilevel Optimization
Z Borsos, M Tagliasacchi, A Krause
ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021
Cited by: 28, Year: 2021
TokenSplit: Using Discrete Speech Representations for Direct, Refined, and Transcript-Conditioned Speech Separation and Recognition
H Erdogan, S Wisdom, X Chang, Z Borsos, M Tagliasacchi, N Zeghidour, ...
arXiv preprint arXiv:2308.10415, 2023
Cited by: 20, Year: 2023
Disentangling speech from surroundings with neural embeddings
A Omran, N Zeghidour, Z Borsos, F de Chaumont Quitry, M Slaney, ...
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 19, Year: 2023
Online Variance Reduction with Mixtures
Z Borsos, S Curi, KY Levy, A Krause
ICML 2019 - Proceedings of the 36th International Conference on Machine …, 2019
Cited by: 19, Year: 2019
Data summarization via bilevel optimization
Z Borsos, M Mutný, M Tagliasacchi, A Krause
Journal of Machine Learning Research 25 (73), 1-53, 2024
Cited by: 15, Year: 2024
Implementing Modular FFTs in FPGAs--A Basic Block for Lattice-Based Cryptography
T Györfi, O Cret, Z Borsos
Digital System Design (DSD), 2013 Euromicro Conference on, 305-308, 2013
Cited by: 11, Year: 2013
Inference of the three-dimensional chromatin structure and its temporal behavior
BC Cristescu, Z Borsos, J Lygeros, MR Martínez, MA Rapsomaniki
arXiv preprint arXiv:1811.09619, 2018
Cited by: 9, Year: 2018