Ting Cao 曹婷
Verified email at mail.tsinghua.edu.cn
Title
Cited by
Year
nn-Meter: Towards accurate latency prediction of deep-learning model inference on diverse edge devices
LL Zhang, S Han, J Wei, N Zheng, T Cao, Y Yang, Y Liu
Proceedings of the 19th Annual International Conference on Mobile Systems …, 2021
Cited by: 198 | Year: 2021
The yin and yang of power and performance for asymmetric hardware and managed software
T Cao, SM Blackburn, T Gao, KS McKinley
ACM SIGARCH Computer Architecture News 40 (3), 225-236, 2012
Cited by: 145 | Year: 2012
Parallel processing systems for big data: a survey
Y Zhang, T Cao, S Li, X Tian, L Yuan, H Jia, AV Vasilakos
Proceedings of the IEEE 104 (11), 2114-2136, 2016
Cited by: 128 | Year: 2016
Looking back on the language and hardware revolutions: measured power, performance, and scaling
H Esmaeilzadeh, T Cao, Y Xi, SM Blackburn, KS McKinley
ACM SIGARCH Computer Architecture News 39 (1), 319-332, 2011
Cited by: 115 | Year: 2011
Panthera: Holistic memory management for big data processing over hybrid memories
C Wang, H Cui, T Cao, J Zigman, H Volos, O Mutlu, F Lv, X Feng, GH Xu
Proceedings of the 40th ACM SIGPLAN Conference on Programming Language …, 2019
Cited by: 93 | Year: 2019
CoDL: efficient CPU-GPU co-execution for deep learning inference on mobile devices.
F Jia, D Zhang, T Cao, S Jiang, Y Liu, J Ren, Y Zhang
MobiSys 22, 209-221, 2022
Cited by: 82 | Year: 2022
AsyMo: scalable and efficient deep-learning inference on asymmetric mobile CPUs
M Wang, S Ding, T Cao, Y Liu, F Xu
Proceedings of the 27th Annual International Conference on Mobile Computing …, 2021
Cited by: 81 | Year: 2021
Pre-gated MoE: An algorithm-system co-design for fast and scalable mixture-of-expert inference
R Hwang, J Wei, S Cao, C Hwang, X Tang, T Cao, M Yang
2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture …, 2024
Cited by: 73 | Year: 2024
BitDistiller: Unleashing the potential of sub-4-bit LLMs via self-distillation
D Du, Y Zhang, S Cao, J Guo, T Cao, X Chu, N Xu
arXiv preprint arXiv:2402.10631, 2024
Cited by: 66 | Year: 2024
SeerAttention: Learning intrinsic sparse attention in your LLMs
Y Gao, Z Zeng, D Du, S Cao, P Zhou, J Qi, J Lai, HKH So, T Cao, F Yang, ...
arXiv preprint arXiv:2410.13276, 2024
Cited by: 64 | Year: 2024
Hybrid SLM and LLM for edge-cloud collaborative inference
Z Hao, H Jiang, S Jiang, J Ren, T Cao
Proceedings of the Workshop on Edge and Mobile Foundation Models, 36-41, 2024
Cited by: 61 | Year: 2024
WADE: Writeback-aware dynamic cache management for NVM-based main memory system
Z Wang, S Shan, T Cao, J Gu, Y Xu, S Mu, Y Xie, DA Jiménez
ACM Transactions on Architecture and Code Optimization (TACO) 10 (4), 1-21, 2013
Cited by: 59 | Year: 2013
T-MAC: CPU renaissance via table lookup for low-bit LLM deployment on edge
J Wei, S Cao, T Cao, L Ma, L Wang, Y Zhang, M Yang
Proceedings of the Twentieth European Conference on Computer Systems, 278-292, 2025
Cited by: 45 | Year: 2025
Looking back and looking forward: power, performance, and upheaval
H Esmaeilzadeh, T Cao, X Yang, SM Blackburn, KS McKinley
Communications of the ACM 55 (7), 105-114, 2012
Cited by: 45 | Year: 2012
Ladder: Enabling efficient low-precision deep learning computing through hardware-aware tensor transformation
L Wang, L Ma, S Cao, Q Zhang, J Xue, Y Shi, N Zheng, Z Miao, F Yang, ...
18th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2024
Cited by: 43 | Year: 2024
Integer or floating point? New outlooks for low-bit quantization on large language models
Y Zhang, L Zhao, S Cao, S Zhang, W Wang, T Cao, F Yang, M Yang, ...
2024 IEEE International Conference on Multimedia and Expo (ICME), 1-6, 2024
Cited by: 40 | Year: 2024
FlexNN: Efficient and adaptive DNN inference on memory-constrained edge devices
X Li, Y Li, Y Li, T Cao, Y Liu
Proceedings of the 30th Annual International Conference on Mobile Computing …, 2024
Cited by: 40 | Year: 2024
Pre-gated MoE: An algorithm-system co-design for fast and scalable mixture-of-expert inference. In 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA)
R Hwang, J Wei, S Cao, C Hwang, X Tang, T Cao, M Yang
IEEE 2, 1018-1031, 2024
Cited by: 40 | Year: 2024
LUT-NN: Empower efficient neural network inference with centroid learning and table lookup
X Tang, Y Wang, T Cao, LL Zhang, Q Chen, D Cai, Y Liu, M Yang
Proceedings of the 29th Annual International Conference on Mobile Computing …, 2023
Cited by: 36 | Year: 2023
VPTQ: Extreme low-bit vector post-training quantization for large language models
Y Liu, J Wen, Y Wang, S Ye, LL Zhang, T Cao, C Li, M Yang
arXiv preprint arXiv:2409.17066, 2024
Cited by: 34 | Year: 2024