| Learning to Prune Deep Neural Networks via Layer-wise Optimal Brain Surgeon X Dong, S Chen, SJ Pan Advances in Neural Information Processing Systems 30 (NIPS 2017) pre-proceedings, 2017 | 672 | 2017 |
| Additive powers-of-two quantization: An efficient non-uniform discretization for neural networks Y Li*, X Dong*, W Wang International Conference on Learning Representations, 2019 | 436 | 2019 |
| A free lunch from ANN: Towards efficient, accurate spiking neural networks calibration Y Li, S Deng, X Dong, R Gong, S Gu International Conference on Machine Learning, 6316-6325, 2021 | 286 | 2021 |
| Binary Ensemble Neural Network: More Bits per Network or More Networks per Bit? S Zhu, X Dong, H Su 2019 Conference on Computer Vision and Pattern Recognition, 2019 | 186 | 2019 |
| Small Language Models are the Future of Agentic AI P Belcak, G Heinrich, S Diao, Y Fu, X Dong, S Muralidharan, YC Lin, ... arXiv preprint arXiv:2506.02153, 2025 | 165* | 2025 |
| exBERT: Extending pre-trained models with domain-specific vocabulary under constrained training resources W Tai, HT Kung, X Dong, M Comiter, CF Kuo Findings of the Association for Computational Linguistics: EMNLP 2020, 1433-1439, 2020 | 147 | 2020 |
| ProRL: Prolonged reinforcement learning expands reasoning boundaries in large language models M Liu, S Diao, X Lu, J Hu, X Dong, Y Choi, J Kautz, Y Dong arXiv preprint arXiv:2505.24864, 2025 | 113 | 2025 |
| Hymba: A hybrid-head architecture for small language models X Dong, Y Fu, S Diao, W Byeon, Z Chen, AS Mahabaleshwarkar, SY Liu, ... The Thirteenth International Conference on Learning Representations, 2024 | 84* | 2024 |
| Neural Mean Discrepancy for Efficient Out-of-Distribution Detection X Dong, J Guo, A Li, WT Ting, C Liu, HT Kung Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022 | 83 | 2022 |
| MixMix: All you need for data-free compression are feature and data mixing Y Li, F Zhu, R Gong, M Shen, X Dong, F Yu, S Lu, S Gu Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2021 | 53 | 2021 |
| Full-stack optimization for accelerating CNNs using powers-of-two weights with FPGA validation B McDanel, SQ Zhang, HT Kung, X Dong Proceedings of the ACM International Conference on Supercomputing, 449-460, 2019 | 53 | 2019 |
| A Main/Subsidiary Network Framework for Simplifying Binary Neural Network Y Xu*, X Dong*, Y Li, H Su 2019 Conference on Computer Vision and Pattern Recognition, 2019 | 45 | 2019 |
| SphereFed: Hyperspherical federated learning X Dong, SQ Zhang, A Li, HT Kung European Conference on Computer Vision, 165-184, 2022 | 42 | 2022 |
| RTN: Reparameterized ternary network Y Li*, X Dong*, SQ Zhang, H Bai, Y Chen, W Wang Proceedings of the AAAI Conference on Artificial Intelligence 34 (04), 4780-4787, 2020 | 41 | 2020 |
| Is heterogeneity notorious? Taming heterogeneity to handle test-time shift in federated learning Y Tan, C Chen, W Zhuang, X Dong, L Lyu, G Long Advances in Neural Information Processing Systems 36, 27167-27180, 2023 | 40 | 2023 |
| SplitNets: Designing Neural Architectures for Efficient Distributed Computing on Head-Mounted Systems X Dong, B De Salvo, M Li, C Liu, Z Qu, HT Kung, Z Li Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022 | 37 | 2022 |
| Maestro: A memory-on-logic architecture for coordinated parallel use of many systolic arrays HT Kung, B McDanel, SQ Zhang, X Dong, CC Chen 2019 IEEE 30th International Conference on Application-specific Systems …, 2019 | 36 | 2019 |
| Converting artificial neural networks to spiking neural networks via parameter calibration Y Li, S Deng, X Dong, S Gu arXiv preprint arXiv:2205.10121, 2022 | 35 | 2022 |
| Efficient bitwidth search for practical mixed precision neural network Y Li, W Wang, H Bai, R Gong, X Dong, F Yu arXiv preprint arXiv:2003.07577, 2020 | 27 | 2020 |
| Segment every out-of-distribution object W Zhao, J Li, X Dong, Y Xiang, Y Guo Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 | 24 | 2024 |