Zeqian Ju

Cited by

	All	Since 2021
Citations	1822	1814
h-index	12	12
i10-index	13	13

Citations per year

Year	Citations
2020	7
2021	34
2022	90
2023	193
2024	524
2025	946
2026	22

Public access

	Available	Not available
Articles	1	0

Based on funding mandates

Zeqian Ju

University of Science and Technology of China

Verified email at mail.ustc.edu.cn


Title	Authors	Venue	Cited by	Year
Meddialog: Two large-scale medical dialogue datasets	X He, S Chen, Z Ju, X Dong, H Fang, S Wang, Y Yang, J Zeng, R Zhang, ...	arXiv preprint arXiv:2004.03329, 2020	370*	2020
Naturalspeech 2: Latent diffusion models are natural and zero-shot speech and singing synthesizers	K Shen, Z Ju, X Tan, Y Liu, Y Leng, L He, T Qin, S Zhao, J Bian	arXiv preprint arXiv:2304.09116, 2023	363	2023
Naturalspeech 3: Zero-shot speech synthesis with factorized codec and diffusion models	Z Ju, Y Wang, K Shen, X Tan, D Xin, D Yang, Y Liu, Y Leng, K Song, ...	arXiv preprint arXiv:2403.03100, 2024	341	2024
Musicbert: Symbolic music understanding with large-scale pre-training	M Zeng, X Tan, R Wang, Z Ju, T Qin, TY Liu	arXiv preprint arXiv:2106.05630, 2021	220	2021
Audit: Audio editing by following instructions with latent diffusion models	Y Wang, Z Ju, X Tan, L He, Z Wu, J Bian	Advances in Neural Information Processing Systems 36, 71340-71357, 2023	114	2023
Kimi-audio technical report	D Ding, Z Ju, Y Leng, S Liu, T Liu, Z Shang, K Shen, W Song, X Tan, ...	arXiv preprint arXiv:2504.18425, 2025	113	2025
Prompttts 2: Describing and generating voices with text prompt	Y Leng, Z Guo, K Shen, X Tan, Z Ju, Y Liu, Y Liu, D Yang, L Zhang, ...	arXiv preprint arXiv:2309.02285, 2023	67	2023
On the generation of medical dialogs for COVID-19	M Zhou, Z Li, B Tan, G Zeng, W Yang, X He, Z Ju, S Chakravorty, S Chen, ...	Proceedings of the 59th Annual Meeting of the Association for Computational …, 2021	65*	2021
Telemelody: Lyric-to-melody generation with a template-based two-stage method	Z Ju, P Lu, X Tan, R Wang, C Zhang, S Wu, K Zhang, X Li, T Qin, TY Liu	arXiv preprint arXiv:2109.09617, 2021	57	2021
Rall-e: Robust codec language modeling with chain-of-thought prompting for text-to-speech synthesis	D Xin, X Tan, K Shen, Z Ju, D Yang, Y Wang, S Takamichi, H Saruwatari, ...	arXiv preprint arXiv:2404.03204, 2024	45	2024
Flashspeech: Efficient zero-shot speech synthesis	Z Ye, Z Ju, H Liu, X Tan, J Chen, Y Lu, P Sun, J Pan, W Bian, S He, W Xue, ...	Proceedings of the 32nd ACM International Conference on Multimedia, 6998-7007, 2024	34	2024
MoonCast: High-quality zero-shot podcast generation	Z Ju, D Yang, J Yu, K Shen, Y Leng, Z Wang, X Tan, X Zhou, T Qin, X Li	arXiv preprint arXiv:2503.14345, 2025	17	2025
ALMTokenizer: A Low-bitrate and Semantic-rich Audio Codec Tokenizer for Audio Language Modeling	D Yang, S Liu, H Guo, J Zhao, Y Wang, H Wang, Z Ju, X Liu, X Chen, ...	arXiv preprint arXiv:2504.10344, 2025	11	2025
Freeaudio: Training-free timing planning for controllable long-form text-to-audio generation	Y Jiang, Z Chen, Z Ju, C Li, W Dou, J Zhu	Proceedings of the 33rd ACM International Conference on Multimedia, 9871-9880, 2025	5	2025
HeartMuLa: A Family of Open Sourced Music Foundation Models	D Yang, Y Xie, Y Yin, Z Wang, X Yi, G Zhu, X Weng, Z Xiong, Y Ma, ...	arXiv preprint arXiv:2601.10547, 2026		2026
ControlAudio: Tackling Text-Guided, Timing-Indicated and Intelligible Audio Generation via Progressive Diffusion Modeling	Y Jiang, Z Chen, Z Ju, Y Dai, W Dou, J Zhu	arXiv preprint arXiv:2510.08878, 2025		2025
