Pan Zexu

Cited by

	All	Since 2021
Citations	1082	1080
h-index	17	17
i10-index	22	22

Citations per year

Year	2021	2022	2023	2024	2025	2026
Citations	19	107	205	275	463	9

Public access

12 articles

3 articles

available

not available

Based on funding mandates

Co-authors

Haizhou Li - The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen), China; NUS, Singapore - Verified email at u.nus.edu
Tao Ruijie - Zoom, NUS - Verified email at u.nus.edu
Xinyuan Qian - Associate Professor, University of Science and Technology Beijing, China - Verified email at nus.edu.sg
Jonathan Le Roux - MERL - Verified email at merl.com
Meng Ge - Tianjin University; CUHK-Shenzhen; National University of Singapore - Verified email at nus.edu.sg
Chenglin Xu - Kuaishou Technology, China - Verified email at kuaishou.com
Tanja Schultz - Professor of Computer Science, University Bremen - Verified email at uni-bremen.de
Zhaojie Luo - Osaka University Assistant Professor - Verified email at irl.sys.es.osaka-u.ac.jp

Pan Zexu

Alibaba; MERL; National University of Singapore

Verified email at u.nus.edu

Multi-media; Speaker extraction


Title	Cited by	Year
Is someone speaking? exploring long-term temporal features for audio-visual active speaker detection - R Tao, Z Pan, RK Das, X Qian, MZ Shou, H Li - Proceedings of the 29th ACM international conference on multimedia, 3927-3935, 2021	254	2021
Multi-modal Attention for Speech Emotion Recognition - Z Pan, Z Luo, J Yang, H Li - Proc. Interspeech 2020, 364-368, 2020	122	2020
Muse: Multi-modal target speaker extraction with visual cues - Z Pan, R Tao, C Xu, H Li - ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021	80	2021
USEV: Universal speaker extraction with visual cue - Z Pan, M Ge, H Li - IEEE/ACM Transactions on Audio, Speech and Language Processing 30, 3032-3045, 2022	66	2022
Selective listening by synchronizing speech with lips - Z Pan, R Tao, C Xu, H Li - IEEE/ACM Transactions on Audio, Speech and Language Processing 30, 1650-1664, 2022	62	2022
Multi-target DoA estimation with an audio-visual fusion mechanism - X Qian, M Madhavi, Z Pan, J Wang, H Li - ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021	58	2021
Speaker Extraction with Co-Speech Gestures Cue - Z Pan, X Qian, H Li - IEEE Signal Processing Letters 29, 1467-1471, 2022	35	2022
NeuroHeed: Neuro-steered speaker extraction using EEG signals - Z Pan, M Borsdorf, S Cai, T Schultz, H Li - IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024	34	2024
TF-Locoformer: Transformer with local modeling by convolution for speech separation and enhancement - K Saijo, G Wichern, FG Germain, Z Pan, J Le Roux - 2024 18th International Workshop on Acoustic Signal Enhancement (IWAENC …, 2024	34	2024
A Hybrid Continuity Loss to Reduce Over-Suppression for Time-domain Target Speaker Extraction - Z Pan, M Ge, H Li - Proc. Interspeech 2022, 2022	29	2022
Target active speaker detection with audio-visual cues - Y Jiang, R Tao, Z Pan, H Li - arXiv preprint arXiv:2305.12831, 2023	26	2023
NIIRF: Neural IIR filter field for HRTF upsampling and personalization - Y Masuyama, G Wichern, FG Germain, Z Pan, S Khurana, C Hori, ... - ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024	25	2024
Restoring speaking lips from occlusion for audio-visual speech recognition - J Wang, Z Pan, M Zhang, RT Tan, H Li - Proceedings of the AAAI conference on artificial intelligence 38 (17), 19144 …, 2024	22	2024
Scenario-aware audio-visual TF-Gridnet for target speech extraction - Z Pan, G Wichern, Y Masuyama, FG Germain, S Khurana, C Hori, ... - 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-8, 2023	22	2023
Time-domain speech separation networks with graph encoding auxiliary - T Wang, Z Pan, M Ge, Z Yang, H Li - IEEE Signal Processing Letters 30, 110-114, 2023	21	2023
NeuroHeed+: Improving neuro-steered speaker extraction with joint auditory attention detection - Z Pan, G Wichern, FG Germain, S Khurana, J Le Roux - ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024	18	2024
Generation or replication: Auscultating audio latent diffusion models - D Bralios, G Wichern, FG Germain, Z Pan, S Khurana, C Hori, J Le Roux - ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024	17	2024
ImagineNET: Target Speaker Extraction with Intermittent Visual Cue through Embedding Inpainting - Z Pan, W Wang, M Borsdorf, H Li - ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2022	16	2022
Emotional dimension control in language model-based text-to-speech: Spanning a broad spectrum of human emotions - K Zhou, Y Zhang, S Zhao, H Wang, Z Pan, D Ng, C Zhang, C Ni, Y Ma, ... - arXiv preprint arXiv:2409.16681, 2024	14	2024
Speech separation with pretrained frontend to minimize domain mismatch - W Wang, Z Pan, X Li, S Wang, H Li - IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024	13	2024
