Rohan Badlani
Computer Science, Stanford University, BITS Pilani
Verified email at cs.stanford.edu
Title
Cited by
Year
Audio flamingo: A novel audio language model with few-shot learning and dialogue abilities
Z Kong, A Goel, R Badlani, W Ping, R Valle, B Catanzaro
arXiv preprint arXiv:2402.01831, 2024
Cited by 174 · Year 2024
One TTS alignment to rule them all
R Badlani, A Łańcucki, KJ Shih, R Valle, W Ping, B Catanzaro
ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and …, 2022
Cited by 111 · Year 2022
P-flow: A fast and data-efficient zero-shot TTS through speech prompting
S Kim, K Shih, JF Santos, E Bakhturina, M Desta, R Valle, S Yoon, ...
Advances in Neural Information Processing Systems 36, 74213-74228, 2023
Cited by 71 · Year 2023
RAD-TTS: Parallel flow-based TTS with robust alignment learning and diverse synthesis
KJ Shih, R Valle, R Badlani, A Łańcucki, W Ping, B Catanzaro
ICML Workshop on Invertible Neural Networks, Normalizing Flows, and Explicit …, 2021
Cited by 69 · Year 2021
Content-based representations of audio using siamese neural networks
P Manocha, R Badlani, A Kumar, A Shah, B Elizalde, B Raj
2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018
Cited by 66 · Year 2018
Experiments on the DCASE challenge 2016: Acoustic scene classification and sound event detection in real life recording
B Elizalde, A Kumar, A Shah, R Badlani, E Vincent, B Raj, I Lane
arXiv preprint arXiv:1607.06706, 2016
Cited by 61* · Year 2016
NELS-Never-Ending Learner of Sounds
B Elizalde, R Badlani, A Shah, A Kumar, B Raj
NIPS Workshop on Machine Learning for Audio, 2018
Cited by 36* · Year 2018
Disambiguating sentiment: An ensemble of humour, sarcasm, and hate speech features for sentiment classification
R Badlani, N Asnani, M Rai
W-NUT 2019, 337-345, 2019
Cited by 31* · Year 2019
An approach for self-training audio event detectors using web data
B Elizalde, A Shah, S Dalmia, MH Lee, R Badlani, A Kumar, B Raj, I Lane
2017 25th European Signal Processing Conference (EUSIPCO), 1863-1867, 2017
Cited by 29* · Year 2017
Improving robustness of LLM-based speech synthesis by learning monotonic alignment
P Neekhara, S Hussain, S Ghosh, J Li, R Valle, R Badlani, B Ginsburg
arXiv preprint arXiv:2406.17957, 2024
Cited by 24 · Year 2024
Generating and using joint representations of source code
R Badlani, O Lewis, G Evangelopoulos, O Hatalsky, B Ni
US Patent 11,169,786, 2021
Cited by 20 · Year 2021
Fugatto 1: Foundational Generative Audio Transformer Opus 1
R Valle, R Badlani, Z Kong, S Lee, A Goel, S Kim, JF Santos, S Dai, ...
The Thirteenth International Conference on Learning Representations, 2025
Cited by 13 · Year 2025
RAD-MMM: Multilingual multiaccented multispeaker text to speech
R Badlani, R Valle, KJ Shih, JF Santos, S Gururani, B Catanzaro
Proc. Interspeech, 626-630, 2023
Cited by 13 · Year 2023
Synthesizing video from audio using one or more neural networks
MY Liu, K Nagano, JRVG da Costa, J Seo, TC Wang, A Mallya, S Khamis, ...
US Patent App. 17/382,027, 2023
Cited by 8 · Year 2023
Framework for evaluation of sound event detection in web videos
R Badlani, A Shah, B Elizalde, A Kumar, B Raj
2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018
Cited by 8 · Year 2018
Audio flamingo: A novel audio language model with few-shot learning and dialogue abilities (2024)
Z Kong, A Goel, R Badlani, W Ping, R Valle, B Catanzaro
URL https://arxiv.org/abs/2402.01831
Cited by 7
Generative modeling for low dimensional speech attributes with neural spline flows
KJ Shih, R Valle, R Badlani, JF Santos, B Catanzaro
arXiv preprint arXiv:2203.01786, 2022
Cited by 6 · Year 2022
Multilingual multiaccented multispeaker TTS with RADTTS
R Badlani, R Valle, KJ Shih, JF Santos, S Gururani, B Catanzaro
arXiv preprint arXiv:2301.10335, 2023
Cited by 5 · Year 2023
Relation extraction with contextualized relation embedding (CRE)
X Chen, R Badlani
arXiv preprint arXiv:2011.09658, 2020
Cited by 5 · Year 2020
Pattern-based automatic parallelization of representative-based clustering algorithms
S Islam, S Balasubramaniam, S Gupta, S Brajesh, R Badlani, ...
2018 IEEE 5th International Conference on Data Science and Advanced …, 2018
Cited by 5 · Year 2018