| HOI4D: A 4D Egocentric Dataset for Category-Level Human-Object Interaction Y Liu, Y Liu, C Jiang, K Lyu, W Wan, H Shen, B Liang, Z Fu, H Wang, L Yi Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022 | 289 | 2022 |
| OR-NeRF: Object removing from 3D scenes guided by multiview segmentation with neural radiance fields Y Yin, Z Fu, F Yang, G Lin arXiv preprint arXiv:2305.10503, 2023 | 42 | 2023 |
| Sculpt3D: Multi-view consistent text-to-3D generation with sparse 3D prior C Chen, X Yang, F Yang, C Feng, Z Fu, CS Foo, G Lin, F Liu Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 | 30 | 2024 |
| Sync4D: Video guided controllable dynamics for physics-based 4D generation Z Fu, J Wei, W Shen, C Song, X Yang, F Liu, X Yang, G Lin arXiv preprint arXiv:2405.16849, 2024 | 17 | 2024 |
| DeepVerse: 4D Autoregressive Video Generation as a World Model J Chen, H Zhu, X He, Y Wang, J Zhou, W Chang, Y Zhou, Z Li, Z Fu, ... arXiv preprint arXiv:2506.01103, 2025 | 14 | 2025 |
| OmniWorld: A multi-domain and multi-modal dataset for 4D world modeling Y Zhou, Y Wang, J Zhou, W Chang, H Guo, Z Li, K Ma, X Li, Y Wang, ... arXiv preprint arXiv:2509.12201, 2025 | 4 | 2025 |
| In-context learning with unpaired clips for instruction-based video editing X Liao, X Zeng, Z Song, Z Fu, G Yu, G Lin arXiv preprint arXiv:2510.14648, 2025 | 2 | 2025 |
| iMontage: Unified, Versatile, Highly Dynamic Many-to-many Image Generation Z Fu, X Zeng, J Lan, X Liao, C Chen, J Chen, J Wei, W Cheng, S Liu, ... arXiv preprint arXiv:2511.20635, 2025 | 1 | 2025 |
| VINO: A Unified Visual Generator with Interleaved OmniModal Context J Chen, T He, Z Fu, P Wan, K Gai, W Ye arXiv preprint arXiv:2601.02358, 2026 | | 2026 |
| PhyCustom: Towards Realistic Physical Customization in Text-to-Image Generation F Wu, C Chen, Z Fu, J Wei, Y Xu, D Ye, G Lin arXiv preprint arXiv:2512.02794, 2025 | | 2025 |