| Context-aware feature generation for zero-shot semantic segmentation Z Gu, S Zhou, L Niu, Z Zhao, L Zhang Proceedings of the 28th ACM International Conference on Multimedia, 1921-1929, 2020 | 170 | 2020 |
| XYLayoutLM: Towards layout-aware multimodal networks for visually-rich document understanding Z Gu, C Meng, K Wang, J Lan, W Wang, M Gu, L Zhang Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022 | 132 | 2022 |
| DiffusionInst: Diffusion Model for Instance Segmentation Z Gu, H Chen, Z Xu, J Lan, C Meng, W Wang Accepted by ICASSP 2024 (oral), 2022 | 129 | 2022 |
| DiffUTE: Universal text editing diffusion model H Chen, Z Xu, Z Gu, Y Li, C Meng, H Zhu, W Wang Advances in Neural Information Processing Systems 36, 63062-63074, 2023 | 58 | 2023 |
| DeMamba: AI-generated video detection on million-scale GenVideo benchmark H Chen, Y Hong, Z Huang, Z Xu, Z Gu, Y Li, J Lan, H Zhu, J Zhang, ... arXiv preprint arXiv:2405.19707, 2024 | 46 | 2024 |
| Hard pixel mining for depth privileged semantic segmentation Z Gu, L Niu, H Zhao, L Zhang IEEE Transactions on Multimedia 23, 3738-3751, 2020 | 44 | 2020 |
| Hierarchical dynamic image harmonization H Chen, Z Gu, Y Li, J Lan, C Meng, W Wang, H Li Proceedings of the 31st ACM International Conference on Multimedia, 1422-1430, 2023 | 43 | 2023 |
| From pixel to patch: Synthesize context-aware features for zero-shot semantic segmentation Z Gu, S Zhou, L Niu, Z Zhao, L Zhang IEEE Transactions on Neural Networks and Learning Systems 34 (10), 7689-7703, 2022 | 24 | 2022 |
| GUI-G²: Gaussian Reward Modeling for GUI Grounding F Tang, Z Gu, Z Lu, X Liu, S Shen, C Meng, W Wang, W Zhang, Y Shen, ... arXiv preprint arXiv:2507.15846, 2025 | 23* | 2025 |
| STC: spatio-temporal contrastive learning for video instance segmentation Z Jiang, Z Gu, J Peng, H Zhou, L Liu, Y Wang, Y Tai, C Wang, L Zhang European Conference on Computer Vision, 539-556, 2022 | 19 | 2022 |
| UI-Venus technical report: Building high-performance UI agents with RFT Z Gu, Z Zeng, Z Xu, X Zhou, S Shen, Y Liu, B Zhou, C Meng, T Xia, ... arXiv preprint arXiv:2508.10833, 2025 | 18* | 2025 |
| Multi‐mode neural network for human action recognition H Zhao, W Xue, X Li, Z Gu, L Niu, L Zhang IET Computer Vision 14 (8), 587-596, 2020 | 13 | 2020 |
| PC2: Pseudo-Classification Based Pseudo-Captioning for Noisy Correspondence Learning in Cross-Modal Retrieval Y Duan, Z Gu, Z Ying, L Qi, C Meng, Y Shi Proceedings of the 32nd ACM International Conference on Multimedia, 9397-9406, 2024 | 12 | 2024 |
| Mobile User Interface Element Detection Via Adaptively Prompt Tuning Z Gu, Z Xu, H Chen, J Lan, C Meng, W Wang Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 12 | 2023 |
| Boosting audio-visual zero-shot learning with large language models H Chen, Y Li, Y Hong, Z Huang, Z Xu, Z Gu, J Lan, H Zhu, W Wang arXiv preprint arXiv:2311.12268, 2023 | 9 | 2023 |
| Backpropagation path search on adversarial transferability Z Xu, Z Gu, J Zhang, S Cui, C Meng, W Wang Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023 | 7 | 2023 |
| Conditional prototype rectification prompt learning H Chen, Y Li, Z Huang, Y Hong, Z Xu, Z Gu, J Lan, H Zhu, W Wang IEEE Transactions on Circuits and Systems for Video Technology, 2025 | 6 | 2025 |
| Clothes keypoints localization and attribute recognition via prior knowledge Z Gu, J Zhang, Z Pan, H Zhao, L Zhang 2019 IEEE International Conference on Multimedia and Expo (ICME), 550-555, 2019 | 6 | 2019 |
| E-ANT: A large-scale dataset for efficient automatic GUI navigation K Wang, T Xia, Z Gu, Y Zhao, S Shen, C Meng, W Wang, K Xu arXiv preprint arXiv:2406.14250, 2024 | 4 | 2024 |
| Segment anything model meets image harmonization H Chen, Y Li, Z Gu, Z Xu, J Lan, H Li ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 3 | 2024 |