Pan Zexu
Verified email at u.nus.edu
Title
Cited by
Year
Is someone speaking? exploring long-term temporal features for audio-visual active speaker detection
R Tao, Z Pan, RK Das, X Qian, MZ Shou, H Li
Proceedings of the 29th ACM international conference on multimedia, 3927-3935, 2021
Cited by: 254, Year: 2021
Multi-modal Attention for Speech Emotion Recognition
Z Pan, Z Luo, J Yang, H Li
Proc. Interspeech 2020, 364-368, 2020
Cited by: 122, Year: 2020
Muse: Multi-modal target speaker extraction with visual cues
Z Pan, R Tao, C Xu, H Li
ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021
Cited by: 80, Year: 2021
USEV: Universal speaker extraction with visual cue
Z Pan, M Ge, H Li
IEEE/ACM Transactions on Audio, Speech and Language Processing 30, 3032-3045, 2022
Cited by: 66, Year: 2022
Selective listening by synchronizing speech with lips
Z Pan, R Tao, C Xu, H Li
IEEE/ACM Transactions on Audio, Speech and Language Processing 30, 1650-1664, 2022
Cited by: 62, Year: 2022
Multi-target DoA estimation with an audio-visual fusion mechanism
X Qian, M Madhavi, Z Pan, J Wang, H Li
ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021
Cited by: 58, Year: 2021
Speaker Extraction with Co-Speech Gestures Cue
Z Pan, X Qian, H Li
IEEE Signal Processing Letters 29, 1467-1471, 2022
Cited by: 35, Year: 2022
NeuroHeed: Neuro-steered speaker extraction using EEG signals
Z Pan, M Borsdorf, S Cai, T Schultz, H Li
IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024
Cited by: 34, Year: 2024
TF-Locoformer: Transformer with local modeling by convolution for speech separation and enhancement
K Saijo, G Wichern, FG Germain, Z Pan, J Le Roux
2024 18th International Workshop on Acoustic Signal Enhancement (IWAENC …, 2024
Cited by: 34, Year: 2024
A Hybrid Continuity Loss to Reduce Over-Suppression for Time-domain Target Speaker Extraction
Z Pan, M Ge, H Li
Proc. Interspeech 2022, 2022
Cited by: 29, Year: 2022
Target active speaker detection with audio-visual cues
Y Jiang, R Tao, Z Pan, H Li
arXiv preprint arXiv:2305.12831, 2023
Cited by: 26, Year: 2023
NIIRF: Neural IIR filter field for HRTF upsampling and personalization
Y Masuyama, G Wichern, FG Germain, Z Pan, S Khurana, C Hori, ...
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by: 25, Year: 2024
Restoring speaking lips from occlusion for audio-visual speech recognition
J Wang, Z Pan, M Zhang, RT Tan, H Li
Proceedings of the AAAI conference on artificial intelligence 38 (17), 19144 …, 2024
Cited by: 22, Year: 2024
Scenario-aware audio-visual TF-Gridnet for target speech extraction
Z Pan, G Wichern, Y Masuyama, FG Germain, S Khurana, C Hori, ...
2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-8, 2023
Cited by: 22, Year: 2023
Time-domain speech separation networks with graph encoding auxiliary
T Wang, Z Pan, M Ge, Z Yang, H Li
IEEE Signal Processing Letters 30, 110-114, 2023
Cited by: 21, Year: 2023
NeuroHeed+: Improving neuro-steered speaker extraction with joint auditory attention detection
Z Pan, G Wichern, FG Germain, S Khurana, J Le Roux
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by: 18, Year: 2024
Generation or replication: Auscultating audio latent diffusion models
D Bralios, G Wichern, FG Germain, Z Pan, S Khurana, C Hori, J Le Roux
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by: 17, Year: 2024
ImagineNET: Target Speaker Extraction with Intermittent Visual Cue through Embedding Inpainting
Z Pan, W Wang, M Borsdorf, H Li
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2022
Cited by: 16, Year: 2022
Emotional dimension control in language model-based text-to-speech: Spanning a broad spectrum of human emotions
K Zhou, Y Zhang, S Zhao, H Wang, Z Pan, D Ng, C Zhang, C Ni, Y Ma, ...
arXiv preprint arXiv:2409.16681, 2024
Cited by: 14, Year: 2024
Speech separation with pretrained frontend to minimize domain mismatch
W Wang, Z Pan, X Li, S Wang, H Li
IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024
Cited by: 13, Year: 2024