Cheng Luo

Cited by

	All	Since 2021
Citations	309	243
h-index	7	7
i10-index	5	5

Year	2017	2018	2019	2020	2021	2022	2023	2024	2025	2026
Citations	2	6	27	31	45	63	41	37	56	1

Public access


4 articles

2 articles

available

not available

Based on funding mandates


Other names: Luo Cheng

Caltech

Verified email at caltech.edu

long context LLM efficient reasoning


Title	Authors	Venue	Cited by	Year
Accelerating low bit-width convolutional neural networks with embedded FPGA	L Jiao, C Luo, W Cao, X Zhou, L Wang	2017 27th international conference on field programmable logic and …, 2017	97	2017
Towards efficient deep neural network training by FPGA-based batch-level parallelism	C Luo, MK Sit, H Fan, S Liu, W Luk, C Guo	Journal of Semiconductors 41 (2), 022403, 2020	76	2020
F-E3D: FPGA-based acceleration of an efficient 3D convolutional neural network for human action recognition	H Fan, C Luo, C Zeng, M Ferianc, Z Que, S Liu, X Niu, W Luk	2019 IEEE 30th international conference on Application-specific Systems …, 2019	56	2019
RNA: An accurate residual network accelerator for quantized and reconstructed deep neural networks	C Luo, W Cao, L Wang, PHW Leong	IEICE Transactions on Information and Systems 102 (5), 1037-1045, 2019	26	2019
HeadInfer: Memory-efficient LLM inference by head-wise offloading	C Luo, Z Cai, H Sun, J Xiao, B Yuan, W Xiao, J Hu, J Zhao, B Chen, ...	arXiv preprint arXiv:2502.12574, 2025	11	2025
R-KV: Redundancy-aware KV Cache Compression for Training-Free Reasoning Models Acceleration	Z Cai, W Xiao, H Sun, C Luo, Y Zhang, K Wan, Y Li, Y Zhou, LW Chang, ...	arXiv preprint arXiv:2505.24133, 2025	8	2025
T3P: Demystifying low-earth orbit satellite broadband	S Tiwari, S Bhushan, A Taneja, M Kassem, C Luo, C Zhou, Z He, ...	arXiv preprint arXiv:2310.11835, 2023	8	2023
Mini-Sequence Transformers: Optimizing Intermediate Memory for Long Sequences Training	C Luo, J Zhao, Z Chen, B Chen, A Anandkumar	Advances in Neural Information Processing Systems 37, 97299-97327, 2024	6	2024
RTP: Rethinking tensor parallelism with memory deduplication	C Luo, T Zhong, G Fox	arXiv preprint arXiv:2311.01635, 2023	5	2023
Moneo: Monitoring fine-grained metrics nonintrusively in AI infrastructure	Y Jiang, Y Xiong, L Qu, CL Luo, C Tian, P Cheng, Y Xiong	ACM SIGOPS Operating Systems Review 56 (1), 18-25, 2022	4	2022
Moneo: Non-intrusive Fine-grained Monitor for AI Infrastructure	Y Jiang, Y Xiong, L Qu, C Luo, C Tian, P Cheng, Y Xiong	ICC 2022-IEEE International Conference on Communications, 2586-2591, 2022	4	2022
Tensor-GaLore: Memory-efficient training via gradient tensor decomposition	RJ George, D Pitt, J Zhao, J Kossaifi, C Luo, Y Tian, A Anandkumar		3	2025
TensorGRaD: Tensor Gradient Robust Decomposition for Memory-Efficient Neural Operator Training	S Loeschcke, D Pitt, RJ George, J Zhao, C Luo, Y Tian, J Kossaifi, ...	arXiv preprint arXiv:2501.02379, 2025	2	2025
CrossoverScheduler: Overlapping Multiple Distributed Training Applications in a Crossover Manner	C Luo, L Qu, Y Miao, P Cheng, Y Xiong	arXiv preprint arXiv:2103.07974, 2021	2	2021
MOM: Memory-Efficient Offloaded Mini-Sequence Inference for Long Context Language Models	J Zhang, T Zhu, C Luo, A Anandkumar	arXiv preprint arXiv:2504.12526, 2025	1	2025
EcoSpa: Efficient Transformer Training with Coupled Sparsity	J Xiao, C Luo, L Huang, C Yang, Y Sui, H Phan, X Zang, Y Ying, Z Tang, ...	arXiv preprint arXiv:2511.11641, 2025		2025
ASAP 2019	H Fan, C Luo, W Luk			
