| Motion-I2V: Consistent and controllable image-to-video generation with explicit motion modeling X Shi, Z Huang, FY Wang, W Bian, D Li, Y Zhang, M Zhang, KC Cheung, ... ACM SIGGRAPH 2024 Conference Papers, 1-11, 2024 | 178 | 2024 |
| FlowFormer++: Masked cost volume autoencoding for pretraining optical flow estimation X Shi, Z Huang, D Li, M Zhang, KC Cheung, S See, H Qin, J Dai, H Li Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 171 | 2023 |
| VideoFlow: Exploiting temporal cues for multi-frame optical flow estimation X Shi, Z Huang, W Bian, D Li, M Zhang, KC Cheung, S See, H Qin, J Dai, ... Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023 | 151 | 2023 |
| Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step Z Guo, R Zhang, C Tong, Z Zhao, R Huang, H Zhang, M Zhang, J Liu, ... arXiv preprint arXiv:2501.13926, 2025 | 106 | 2025 |
| Lumina-Image 2.0: A unified and efficient image generative framework Q Qin, L Zhuo, Y Xin, R Du, Z Li, B Fu, Y Lu, J Yuan, X Li, D Liu, X Zhu, ... arXiv preprint arXiv:2503.21758, 2025 | 52* | 2025 |
| Deep reward supervisions for tuning text-to-image diffusion models X Wu, Y Hao, M Zhang, K Sun, Z Huang, G Song, Y Liu, H Li European Conference on Computer Vision, 108-124, 2024 | 36 | 2024 |
| Decoupled DETR: Spatially disentangling localization and classification for improved end-to-end object detection M Zhang, G Song, Y Liu, H Li Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023 | 36 | 2023 |
| Discriminability distillation in group representation learning M Zhang, G Song, H Zhou, Y Liu European Conference on Computer Vision, 1-19, 2020 | 27 | 2020 |
| Towards FLOPs-constrained face recognition Y Liu, G Song, M Zhang, J Liu, Y Zhou, J Yan Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2019 | 21 | 2019 |
| LongCat-Flash technical report MLC Team, B Li, B Lei, B Wang, B Rong, C Wang, C Zhang, C Gao, ... arXiv preprint arXiv:2509.01322, 2025 | 12 | 2025 |
| DI-drive: OpenDILab decision intelligence platform for autonomous driving simulation D Drive Contributors | 10 | 2021 |
| Are Video Models Ready as Zero-Shot Reasoners? An Empirical Study with the MME-CoF Benchmark Z Guo, X Chen, R Zhang, R An, Y Qi, D Jiang, X Li, M Zhang, H Li, ... arXiv preprint arXiv:2510.26802, 2025 | 8* | 2025 |
| Switchable k-class hyperplanes for noise-robust representation learning B Liu, G Song, M Zhang, H You, Y Liu Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2021 | 7 | 2021 |
| Deep learning for computational science and engineering J Adie, J Yang, M Zhang, S See GPU Technology Conference, 4-7, 2018 | 7 | 2018 |
| Think with 3D: Geometric imagination grounded spatial reasoning from limited views Z Chen, M Zhang, X Yu, X Luo, M Sun, Z Pan, Y Feng, P Pei, X Cai, ... arXiv preprint arXiv:2510.18632, 2025 | 6 | 2025 |
| 1st place solution for AVA-Kinetics crossover in ActivityNet challenge 2020 S Chen, J Pan, G Song, M Zhang, H Shao, Z Lin, J Shao, H Li, Y Liu arXiv preprint arXiv:2006.09116, 2020 | 6 | 2020 |
| Three things we need to know about transferring Stable Diffusion to visual dense prediction tasks M Zhang, G Song, X Shi, Y Liu, H Li European Conference on Computer Vision, 128-145, 2024 | 5 | 2024 |
| Towards robust face recognition with comprehensive search M Zhang, G Song, Y Liu, H Li European Conference on Computer Vision, 720-736, 2022 | 5 | 2022 |
| Tensor sensing for RF tomographic imaging T Deng, F Qian, XY Liu, M Zhang, A Walid 2018 IEEE International Conference on Multimedia and Expo (ICME), 1-6, 2018 | 5 | 2018 |
| CodePlot-CoT: Mathematical visual reasoning by thinking with code-driven images C Duan, K Sun, R Fang, M Zhang, Y Feng, Y Luo, Y Liu, K Wang, P Pei, ... arXiv preprint arXiv:2510.11718, 2025 | 3 | 2025 |