Zhifeng Kong
Senior Research Scientist, NVIDIA
Verified email at ucsd.edu
Title
Cited by
Year
DiffWave: A versatile diffusion model for audio synthesis
Z Kong, W Ping, J Huang, K Zhao, B Catanzaro
ICLR 2021 (oral), 2021
Cited by: 2108; Year: 2021
On fast sampling of diffusion probabilistic models
Z Kong, W Ping
ICML 2021 Workshop on Invertible Neural Networks, Normalizing Flows, and …, 2021
Cited by: 244; Year: 2021
A conditional point diffusion-refinement paradigm for 3D point cloud completion
Z Lyu*, Z Kong*, X Xu, L Pan, D Lin
ICLR 2022, 2022
Cited by: 191; Year: 2022
Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities
Z Kong, A Goel, R Badlani, W Ping, R Valle, B Catanzaro
ICML 2024, 2024
Cited by: 174; Year: 2024
Speech denoising in the waveform domain with self-attention
Z Kong, W Ping, A Dantrey, B Catanzaro
ICASSP 2022, 7867-7871, 2022
Cited by: 108; Year: 2022
Fastened CROWN: Tightened neural network robustness certificates
Z Lyu, CY Ko, Z Kong, N Wong, D Lin, L Daniel
AAAI 2020, 2020
Cited by: 88; Year: 2020
Audio Flamingo 2: An audio-language model with long-audio understanding and expert reasoning abilities
S Ghosh, Z Kong, S Kumar, S Sakshi, J Kim, W Ping, R Valle, D Manocha, ...
ICML 2025, 2025
Cited by: 86; Year: 2025
The expressive power of a class of normalizing flow models
Z Kong, K Chaudhuri
AISTATS 2020, 2020
Cited by: 69; Year: 2020
Multi-object tracking using online metric learning with long short-term memory
X Wan, J Wang, Z Kong, Q Zhao, S Deng
ICIP 2018, 2018
Cited by: 51; Year: 2018
Audio Flamingo 3: Advancing audio intelligence with fully open large audio language models
A Goel, S Ghosh, J Kim, S Kumar, Z Kong, S Lee, CHH Yang, ...
NeurIPS 2025, 2025
Cited by: 49; Year: 2025
Understanding instance-based interpretability of variational auto-encoders
Z Kong, K Chaudhuri
NeurIPS 2021, 2021
Cited by: 43; Year: 2021
TangoFlux: Super fast and faithful text to audio generation with flow matching and CLAP-ranked preference optimization
CY Hung, N Majumder, Z Kong, A Mehrish, AA Bagherzadeh, C Li, ...
arXiv preprint arXiv:2412.21037, 2024
Cited by: 41; Year: 2024
Can membership inferencing be refuted?
Z Kong, AR Chowdhury, K Chaudhuri
arXiv preprint arXiv:2303.03648, 2023
Cited by: 21*; Year: 2023
Improving text-to-audio models with synthetic captions
Z Kong, S Lee, D Ghosal, N Majumder, A Mehrish, R Valle, S Poria, ...
SynData4GenAI 2024, 2024
Cited by: 19; Year: 2024
Data redaction from pre-trained GANs
Z Kong, K Chaudhuri
IEEE SaTML 2023, 2023
Cited by: 19; Year: 2023
CleanUNet 2: A hybrid speech denoising model on waveform and spectrogram
Z Kong, W Ping, A Dantrey, B Catanzaro
INTERSPEECH 2023, 2023
Cited by: 14; Year: 2023
Fugatto 1: Foundational Generative Audio Transformer Opus 1
R Valle, R Badlani, Z Kong, S Lee, A Goel, S Kim, JF Santos, S Dai, ...
ICLR 2025, 2024
Cited by: 13; Year: 2024
Audio dialogues: Dialogues dataset for audio and music understanding
A Goel, Z Kong, R Valle, B Catanzaro
arXiv preprint arXiv:2404.07616, 2024
Cited by: 10; Year: 2024
Universal approximation of residual flows in maximum mean discrepancy
Z Kong, K Chaudhuri
ICML 2021 Workshop on Invertible Neural Networks, Normalizing Flows, and …, 2021
Cited by: 9; Year: 2021
ETTA: Elucidating the Design Space of Text-to-Audio Models
S Lee*, Z Kong*, A Goel, S Kim, R Valle, B Catanzaro
ICML 2025, 2024
Cited by: 8; Year: 2024