| Title | Authors | Venue | Cited by | Year |
| --- | --- | --- | --- | --- |
| Accelerating Multi-Scalar Multiplication for Efficient Zero Knowledge Proofs with Multi-GPU Systems | Z Ji, Z Zhang, J Xu, L Ju | Proceedings of the 29th ACM International Conference on Architectural …, 2024 | 17 | 2024 |
| Accelerating DBSCAN algorithm with AI chips for large datasets | Z Ji, CL Wang | Proceedings of the 50th International Conference on Parallel Processing, 1-11, 2021 | 16 | 2021 |
| Efficient exact K-nearest neighbor graph construction for billion-scale datasets using GPUs with tensor cores | Z Ji, CL Wang | Proceedings of the 36th ACM International Conference on Supercomputing, 1-12, 2022 | 15 | 2022 |
| Momentum-driven adaptive synchronization model for distributed DNN training on HPC clusters | Z Zhang, Z Ji, C Wang | Journal of Parallel and Distributed Computing 159, 65-84, 2022 | 11 | 2022 |
| Embedding Communication for Federated Graph Neural Networks with Privacy Guarantees | X Wu, Z Ji, CL Wang | 2023 IEEE 43rd International Conference on Distributed Computing Systems …, 2023 | 9 | 2023 |
| Hg-caffe: Mobile and embedded neural network GPU (OpenCL) inference engine with FP16 supporting | Z Ji | arXiv preprint arXiv:1901.00858, 2019 | 8 | 2019 |
| FedCSpc: A Cross-Silo Federated Learning System with Error-Bounded Lossy Parameter Compression | Z Zhang, S Di, K Zhao, S Jin, D Tao, Z Ji, B Liu, KA Alharthi, J Cao, ... | IEEE Transactions on Parallel and Distributed Systems, 2025 | 6 | 2025 |
| Collaborative GPU preemption via spatial multitasking for efficient GPU sharing | Z Ji, CL Wang | European Conference on Parallel Processing, 89-104, 2021 | 5 | 2021 |
| CTXBack: Enabling Low Latency GPU Context Switching via Context Flashback | Z Ji, CL Wang | 2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2021 | 5 | 2021 |
| ILP-M Conv: Optimize Convolution Algorithm for Single-Image Convolution Neural Network Inference on Mobile GPUs | Z Ji | arXiv preprint arXiv:1909.02765, 2019 | 5 | 2019 |
| FedEFsz: Fair Cross-Silo Federated Learning System with Error-Bounded Lossy Compression | Z Zhang, S Di, B Liu, Z Ji, G Li, X Lu, AC Zhou, KA Alharthi, J Cao | IEEE Transactions on Parallel and Distributed Systems, 2025 | 4 | 2025 |
| A Compiler-Like Framework for Optimizing Cryptographic Big Integer Multiplication on GPUs | Z Ji, J Zhao, Z Zhang, J Xu, S Yan, L Ju | 2024 57th IEEE/ACM International Symposium on Microarchitecture (MICRO), 380-392, 2024 | 3 | 2024 |
| Accelerating Number Theoretic Transform with Multi-GPU Systems for Efficient Zero Knowledge Proof | Z Ji, J Zhao, P Gao, X Yin, L Ju | Proceedings of the 30th ACM International Conference on Architectural …, 2025 | 2 | 2025 |
| Compiler-Directed Incremental Checkpointing for Low Latency GPU Preemption | Z Ji, CL Wang | 2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2022 | 2 | 2022 |
| Cube-fx: Mapping Taylor Expansion Onto Matrix Multiplier-Accumulators of Huawei Ascend AI Processors | Y Tang, H Zhou, Z Ji, CL Wang | IEEE Transactions on Parallel and Distributed Systems, 2025 | 1 | 2025 |
| VESTA: A Secure and Efficient FHE-based Three-Party Vectorized Evaluation System for Tree Aggregation Models | H Zhao, J Huang, Z Chen, K Zhu, D Chen, Z Ji, H Liu | Proceedings of the ACM on Measurement and Analysis of Computing Systems 9 (1 …, 2025 | | 2025 |
| POSTER: Accelerating High-Precision Integer Multiplication used in Cryptosystems with GPUs | Z Ji, Z Zhang, J Xu, L Ju | Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and …, 2024 | | 2024 |
| Optimizing Aggregate Computation of Graph Neural Networks with on-GPU Interpreter-Style Programming | Z Ji, CL Wang | Proceedings of the International Conference on Parallel Architectures and …, 2022 | | 2022 |
| Efficient machine learning on GPUs: optimizing algorithms and harvesting idle cycles | Z Ji | HKU Theses Online (HKUTO), 2022 | | 2022 |