| MsEmoTTS: Multi-scale emotion transfer, prediction, and control for emotional speech synthesis Y Lei, S Yang, X Wang, L Xie IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 853-864, 2022 | 112 | 2022 |
| Fine-grained emotion strength transfer, control and prediction for emotional speech synthesis Y Lei, S Yang, L Xie 2021 IEEE Spoken Language Technology Workshop (SLT), 423-430, 2021 | 83 | 2021 |
| PromptStyle: Controllable style transfer for text-to-speech with natural language descriptions G Liu, Y Zhang, Y Lei, Y Chen, R Wang, Z Li, L Xie arXiv preprint arXiv:2305.19522, 2023 | 60 | 2023 |
| PromptVC: Flexible stylistic voice conversion in latent space driven by natural language prompts J Yao, Y Yang, Y Lei, Z Ning, Y Hu, Y Pan, J Yin, H Zhou, H Lu, L Xie ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 49 | 2024 |
| METTS: Multilingual emotional text-to-speech by cross-speaker and cross-lingual emotion transfer X Zhu, Y Lei, T Li, Y Zhang, H Zhou, H Lu, L Xie IEEE/ACM Transactions on Audio, Speech, and Language Processing 32, 1506-1518, 2024 | 40 | 2024 |
| Vec-Tok Speech: Speech vectorization and tokenization for neural speech generation X Zhu, Y Lv, Y Lei, T Li, W He, H Zhou, H Lu, L Xie IEEE Transactions on Audio, Speech and Language Processing, 2025 | 27 | 2025 |
| Multi-speaker expressive speech synthesis via multiple factors decoupling X Zhu, Y Lei, K Song, Y Zhang, T Li, L Xie ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 25 | 2023 |
| DSPGAN: A GAN-based universal vocoder for high-fidelity TTS by time-frequency domain supervision from DSP K Song, Y Zhang, Y Lei, J Cong, H Li, L Xie, G He, J Bai ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 24 | 2023 |
| Cross-speaker emotion transfer through information perturbation in emotional speech synthesis Y Lei, S Yang, X Zhu, L Xie, D Su IEEE Signal Processing Letters 29, 1948-1952, 2022 | 24 | 2022 |
| PromptSpeaker: Speaker generation based on text descriptions Y Zhang, G Liu, Y Lei, Y Chen, H Yin, L Xie, Z Li 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-7, 2023 | 22 | 2023 |
| Distinguishable speaker anonymization based on formant and fundamental frequency scaling J Yao, Q Wang, Y Lei, P Guo, L Xie, N Wang, J Liu ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 21 | 2023 |
| Glow-WaveGAN 2: High-quality zero-shot text-to-speech synthesis and any-to-any voice conversion Y Lei, S Yang, J Cong, L Xie, D Su arXiv preprint arXiv:2207.01832, 2022 | 21 | 2022 |
| Preserving background sound in noise-robust voice conversion via multi-task learning J Yao, Y Lei, Q Wang, P Guo, Z Ning, L Xie, H Li, J Liu, D Xie ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 16 | 2023 |
| UniSyn: An end-to-end unified model for text-to-speech and singing voice synthesis Y Lei, S Yang, X Wang, Q Xie, J Yao, L Xie, D Su Proceedings of the AAAI Conference on Artificial Intelligence 37 (11), 13025 …, 2023 | 15 | 2023 |
| Learn2Sing: Target speaker singing voice synthesis by learning from a singing teacher H Xue, S Yang, Y Lei, L Xie, X Li 2021 IEEE Spoken Language Technology Workshop (SLT), 522-529, 2021 | 15 | 2021 |
| StyleS2ST: Zero-shot style transfer for direct speech-to-speech translation K Song, Y Ren, Y Lei, C Wang, K Wei, L Xie, X Yin, Z Ma arXiv preprint arXiv:2305.17732, 2023 | 14 | 2023 |
| VITS-based Singing Voice Conversion System with DSPGAN post-processing for SVCC2023 Y Zhou, M Chen, Y Lei, J Zhu, W Zhao 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-8, 2023 | 11 | 2023 |
| End-to-end voice conversion with information perturbation Q Xie, S Yang, Y Lei, L Xie, D Su 2022 13th International Symposium on Chinese Spoken Language Processing …, 2022 | 11 | 2022 |
| Accent-VITS: Accent transfer for end-to-end TTS L Ma, Y Zhang, X Zhu, Y Lei, Z Ning, P Zhu, L Xie National Conference on Man-Machine Speech Communication, 203-214, 2023 | 8 | 2023 |
| Zero-shot emotion transfer for cross-lingual speech synthesis Y Li, X Zhu, Y Lei, H Li, J Liu, D Xie, L Xie 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-8, 2023 | 7 | 2023 |