Song Han
Title
Cited by
Year
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
S Han, H Mao, WJ Dally
International Conference on Learning Representations (ICLR'16 best paper award), 2015
Cited by: 13279; Year: 2015
SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size
FN Iandola, S Han, MW Moskewicz, K Ashraf, WJ Dally, K Keutzer
arXiv preprint arXiv:1602.07360, 2016
Cited by: 12151; Year: 2016
Learning both Weights and Connections for Efficient Neural Network
S Han, J Pool, J Tran, W Dally
Advances in Neural Information Processing Systems (NIPS), 1135-1143, 2015
Cited by: 9950; Year: 2015
EIE: Efficient Inference Engine on Compressed Deep Neural Network
S Han, X Liu, H Mao, J Pu, A Pedram, MA Horowitz, WJ Dally
International Symposium on Computer Architecture (ISCA 2016), 2016
Cited by: 3645; Year: 2016
Deep leakage from gradients
L Zhu, Z Liu, S Han
Advances in neural information processing systems 32, 2019
Cited by: 3582; Year: 2019
TSM: Temporal shift module for efficient video understanding
J Lin, C Gan, S Han
Proceedings of the IEEE International Conference on Computer Vision, 7083-7093, 2019
Cited by: 2814; Year: 2019
ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware
H Cai, L Zhu, S Han
International Conference on Learning Representations (ICLR) 2019, 2018
Cited by: 2619; Year: 2018
AMC: AutoML for model compression and acceleration on mobile devices
Y He, J Lin, Z Liu, H Wang, LJ Li, S Han
Proceedings of the European Conference on Computer Vision (ECCV), 784-800, 2018
Cited by: 1998; Year: 2018
Deep gradient compression: Reducing the communication bandwidth for distributed training
Y Lin, S Han, H Mao, Y Wang, WJ Dally
International Conference on Learning Representations (ICLR) 2018, 2017
Cited by: 1977; Year: 2017
Once-for-all: Train one network and specialize it for efficient deployment
H Cai, C Gan, T Wang, Z Zhang, S Han
International Conference on Learning Representations (ICLR) 2020, 2019
Cited by: 1869; Year: 2019
AWQ: Activation-aware weight quantization for on-device LLM compression and acceleration
J Lin, J Tang, H Tang, S Yang, WM Chen, WC Wang, G Xiao, X Dang, ...
Proceedings of machine learning and systems (MLSys'24), best paper award 6 …, 2024
Cited by: 1721; Year: 2024
SmoothQuant: Accurate and efficient post-training quantization for large language models
G Xiao, J Lin, M Seznec, H Wu, J Demouth, S Han
International conference on machine learning, 38087-38099, 2023
Cited by: 1703; Year: 2023
BEVFusion: Multi-task multi-sensor fusion with unified bird's-eye view representation
Z Liu, H Tang, A Amini, X Yang, H Mao, D Rus, S Han
arXiv preprint arXiv:2205.13542, 2022
Cited by: 1569; Year: 2022
Trained Ternary Quantization
C Zhu, S Han, H Mao, WJ Dally
International Conference on Learning Representations (ICLR) 2017, 2016
Cited by: 1463; Year: 2016
HAQ: Hardware-aware automated quantization with mixed precision
K Wang, Z Liu, Y Lin, J Lin, S Han
Proceedings of the IEEE conference on computer vision and pattern …, 2019
Cited by: 1452; Year: 2019
Model compression and hardware acceleration for neural networks: A comprehensive survey
L Deng, G Li, S Han, L Shi, Y Xie
Proceedings of the IEEE 108 (4), 485-532, 2020
Cited by: 1239; Year: 2020
Efficient streaming language models with attention sinks
G Xiao, Y Tian, B Chen, S Han, M Lewis
ICLR'24, 2023
Cited by: 1206; Year: 2023
Point-voxel CNN for efficient 3D deep learning
Z Liu, H Tang, Y Lin, S Han
Advances in neural information processing systems 32, 2019
Cited by: 1013; Year: 2019
Searching efficient 3D architectures with sparse point-voxel convolution
H Tang, Z Liu, S Zhao, Y Lin, J Lin, H Wang, S Han
European conference on computer vision, 685-702, 2020
Cited by: 958; Year: 2020
ESE: Efficient Speech Recognition Engine with Sparse LSTM on FPGA
S Han, J Kang, H Mao, Y Hu, X Li, Y Li, D Xie, H Luo, S Yao, Y Wang, ...
International Symposium on Field-Programmable Gate Arrays (FPGA'17), 75-84, 2017
Cited by: 932; Year: 2017