Guyue Huang

Cited by

	All	Since 2021
Citations	1010	1006
h-index	11	11
i10-index	11	11

Citations per year
2021	34
2022	112
2023	206
2024	271
2025	375
2026	5

Public access (based on funding mandates)

11 articles available
1 article not available

Co-authors

Yu Wang (汪玉): Department of Electronic Engineering, Tsinghua University, China. Verified email at mail.tsinghua.edu.cn
Yufei Ding: University of California, San Diego. Verified email at ucsd.edu
Guohao Dai: Associate Professor of Shanghai Jiao Tong University. Verified email at sjtu.edu.cn
Yuan Xie: Chair Professor of Hong Kong University of Science and Technology (HKUST). Verified email at ust.hk
Hengrui Zhang: Princeton University. Verified email at princeton.edu
Bei Yu: Professor, The Chinese University of Hong Kong. Verified email at cse.cuhk.edu.hk
Boyuan Feng: Ph.D.@UCSB; SWE@PyTorch. Verified email at ucsb.edu
Zheng Wang: Ph.D. student, University of California, San Diego. Verified email at ucsd.edu
Zhongming Yu: University of California, San Diego. Verified email at ucsd.edu

Guyue Huang

NVIDIA

Verified email at nvidia.com

Interests: computer architecture, deep learning system


Title	Cited by	Year
Machine learning for electronic design automation: A survey; G Huang, J Hu, Y He, J Liu, M Ma, Z Shen, J Wu, Y Xu, H Zhang, K Zhong, ...; ACM Transactions on Design Automation of Electronic Systems (TODAES) 26 (5 …, 2021	422	2021
GE-SpMM: General-purpose Sparse Matrix-Matrix Multiplication on GPUs for Graph Neural Networks; G Huang, G Dai, Y Wang, H Yang; Proceedings of the International Conference for High Performance Computing …, 2020	198	2020
TC-GNN: Bridging Sparse GNN Computation and Dense Tensor Cores on GPUs; Y Wang, B Feng, Z Wang, G Huang, Y Ding; 2023 USENIX Annual Technical Conference (USENIX ATC 23), 149-164, 2023	90	2023
Understanding GNN Computational Graph: A Coordinated Computation, IO, and Memory Perspective; H Zhang, Z Yu, G Dai, G Huang, Y Ding, Y Xie, Y Wang; Proceedings of Machine Learning and Systems 4, 467-484, 2022	81	2022
LightSeq2: Accelerated training for transformer-based models on GPUs; X Wang, Y Wei, Y Xiong, G Huang, X Qian, Y Ding, M Wang, L Li; SC22: International Conference for High Performance Computing, Networking …, 2022	49	2022
MixDQ: Memory-efficient few-step text-to-image diffusion models with metric-decoupled mixed precision quantization; T Zhao, X Ning, T Fang, E Liu, G Huang, Z Lin, S Yan, G Dai, Y Wang; European Conference on Computer Vision, 285-302, 2024	35	2024
Heuristic adaptability to input dynamics for SpMM on GPUs; G Dai, G Huang, S Yang, Z Yu, H Zhang, Y Ding, Y Xie, H Yang, Y Wang; Proceedings of the 59th ACM/IEEE Design Automation Conference, 595-600, 2022	35	2022
ALCOP: Automatic load-compute pipelining in deep learning compiler for AI-GPUs; G Huang, Y Bai, L Liu, Y Wang, B Yu, Y Ding, Y Xie; Proceedings of Machine Learning and Systems 5, 680-694, 2023	30	2023
Shfl-BW: accelerating deep neural network inference with tensor-core aware weight pruning; G Huang, H Li, M Qin, F Sun, Y Ding, Y Xie; Proceedings of the 59th ACM/IEEE Design Automation Conference, 1153-1158, 2022	22	2022
RM-STC: Row-Merge Dataflow Inspired GPU Sparse Tensor Core for Energy-Efficient Sparse Acceleration; G Huang, Z Wang, PA Tsai, C Zhang, Y Ding, Y Xie; Proceedings of the 56th Annual IEEE/ACM International Symposium on …, 2023	20	2023
Exploiting Online Locality and Reduction Parallelism for Sampled Dense Matrix Multiplication on GPUs; Z Yu, G Dai, G Huang, Y Wang, H Yang; 2021 IEEE 39th International Conference on Computer Design (ICCD), 567-574, 2021	14	2021
OPER: Optimality-Guided Embedding Table Parallelization for Large-scale Recommendation Model; Z Wang, Y Wang, B Feng, G Huang, D Mudigere, B Muthiah, A Li, Y Ding; 2024 USENIX Annual Technical Conference (USENIX ATC 24), 667-682, 2024	5	2024
Efficient Sparse Matrix Kernels based on Adaptive Workload-Balancing and Parallel-Reduction; G Huang, G Dai, Y Wang, Y Ding, Y Xie; arXiv preprint arXiv:2106.16064, 2021	5	2021
Enabling Efficient Sparse Multiplications on GPUs with Heuristic Adaptability; J Xu, S Huang, J Li, G Huang, Y Xie, Y Wang, G Dai; IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2024	3	2024
Low-power in-pixel buffer circuit for smart image sensor; Q Li, H Zhu, G Huang, Z Yu, F Qiao, Q Wei, X Liu, H Yang; Sensor Review 40 (5), 585-590, 2020	1	2020
TRACI: Network Acceleration of Input-Dynamic Communication for Large-Scale Deep Learning Recommendation Model; G Huang, H Li, L Qin, J Huang, Y Kang, Y Ding, Y Xie; Proceedings of the 52nd Annual International Symposium on Computer …, 2025		2025
GMI-DRL: Empowering Multi-GPU DRL with Adaptive-Grained Parallelism; Y Wang, B Feng, Z Wang, G Huang, TT Geng, A Li, Y Ding; 2025 USENIX Annual Technical Conference (USENIX ATC 25), 89-103, 2025		2025
Warp execution method and associated GPU; Y Gao, F Sun, H Li, G Huang, C Zhang, R Zhong; US Patent 12,100,064, 2024		2024
High-Performance Deep Learning Systems via DL Sparsity and DL Compiler; G Huang; University of California, Santa Barbara, 2024		2024
Systems and methods for neural network training with weight sparsity; F Sun, M Qin, H Li, G Zhu, Y Gao, G Huang, Y Zhang; US Patent App. 17/866,194, 2023		2023
