Guangzhi Sun

Cited by

	All	Since 2021
Citations	2003	1962
h-index	18	18
i10-index	27	27

Citations per year
Year	Citations
2020	33
2021	106
2022	99
2023	130
2024	405
2025	1158
2026	57

Public access (based on funding mandates)
Available	7 articles
Not available	0 articles

Guangzhi Sun

Other names: Brian Sun

University of Cambridge

Verified email at cam.ac.uk

Speech and language technology, conversational AI


Title	Cited by	Year
SALMONN: Towards generic hearing abilities for large language models. C Tang, W Yu, G Sun, X Chen, T Tan, W Li, L Lu, Z Ma, C Zhang. arXiv preprint arXiv:2310.13289, 2023	623	2023
Large language models surpass human experts in predicting neuroscience results. X Luo, A Rechardt, G Sun, KK Nejad, F Yáñez, B Yilmaz, K Lee, ... Nature Human Behaviour 9 (2), 305-315, 2025	209	2025
Fully-hierarchical fine-grained prosody modeling for interpretable speech synthesis. G Sun, Y Zhang, RJ Weiss, Y Cao, H Zen, Y Wu. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and …, 2020	160	2020
Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and autoregressive prosody prior. G Sun, Y Zhang, RJ Weiss, Y Cao, H Zen, A Rosenberg, B Ramabhadran, ... ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and …, 2020	130*	2020
Connecting speech encoder and large language model for ASR. W Yu, C Tang, G Sun, X Chen, T Tan, W Li, L Lu, Z Ma, C Zhang. ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and …, 2024	104	2024
video-SALMONN: Speech-enhanced audio-visual large language models. G Sun, W Yu, C Tang, X Chen, T Tan, W Li, L Lu, Z Ma, Y Wang, C Zhang. arXiv preprint arXiv:2406.15704, 2024	92*	2024
Building better AI agents: A provocation on the utilisation of persona in LLM-based conversational agents. G Sun, X Zhan, J Such. Proceedings of the 6th ACM Conference on Conversational User Interfaces, 1-6, 2024	76	2024
Transformer language models with LSTM-based cross-utterance information representation. G Sun, C Zhang, PC Woodland. ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and …, 2021	48	2021
TorchAudio 2.1: Advancing speech recognition, self-supervised learning, and audio processing components for PyTorch. J Hwang, M Hira, C Chen, X Zhang, Z Ni, G Sun, P Ma, R Huang, V Pratap, ... 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-9, 2023	44	2023
Speaker diarisation using 2D self-attentive combination of embeddings. G Sun, C Zhang, PC Woodland. ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and …, 2019	40	2019
Tree-constrained pointer generator for end-to-end contextual speech recognition. G Sun, C Zhang, PC Woodland. 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU …, 2021	37	2021
SALMONN-omni: A codec-free LLM for full-duplex speech understanding and generation. W Yu, S Wang, X Yang, X Chen, X Tian, J Zhang, G Sun, L Lu, Y Wang, ... arXiv preprint arXiv:2411.18138, 2024	36*	2024
Can contextual biasing remain effective with Whisper and GPT-2? G Sun, X Zheng, C Zhang, PC Woodland. arXiv preprint arXiv:2306.01942, 2023	28	2023
Extending large language models for speech and audio captioning. C Tang, W Yu, G Sun, X Chen, T Tan, W Li, L Lu, Z Ma, C Zhang. ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and …, 2024	26	2024
Affect recognition in conversations using large language models. S Feng, G Sun, N Lubis, W Wu, C Zhang, M Gasic. Proceedings of the 25th Annual Meeting of the Special Interest Group on …, 2024	24	2024
Combination of deep speaker embeddings for diarisation. G Sun, C Zhang, PC Woodland. Neural Networks 141, 372-384, 2021	24	2021
Enabling auditory large language models for automatic speech quality evaluation. S Wang, W Yu, Y Yang, C Tang, Y Li, J Zhuang, X Chen, X Tian, J Zhang, ... ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and …, 2025	20	2025
Enhancing multimodal LLM for detailed and accurate video captioning using multi-round preference optimization. C Tang, Y Li, Y Yang, J Zhuang, G Sun, W Li, Z Ma, C Zhang. arXiv preprint arXiv:2410.06682, 2024	20*	2024
Can large language models understand spatial audio? C Tang, W Yu, G Sun, X Chen, T Tan, W Li, J Zhang, L Lu, Z Ma, Y Wang, ... arXiv preprint arXiv:2406.07914, 2024	18	2024
Parameter efficient finetuning for speech emotion recognition and domain adaptation. N Lashkarashvili, W Wu, G Sun, PC Woodland. ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and …, 2024	17	2024
