| Splitwise: Efficient generative LLM inference using phase splitting P Patel, E Choukse, C Zhang, A Shah, Í Goiri, S Maleki, R Bianchini 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture …, 2024 | 474 | 2024 |
| DynamoLLM: Designing LLM inference clusters for performance and energy efficiency J Stojkovic, C Zhang, Í Goiri, J Torrellas, E Choukse 2025 IEEE International Symposium on High Performance Computer Architecture …, 2025 | 148 | 2025 |
| Bit-plane compression: Transforming data for better compression in many-core architectures J Kim, M Sullivan, E Choukse, M Erez ACM SIGARCH Computer Architecture News 44 (3), 329-340, 2016 | 138 | 2016 |
| PruneTrain: fast neural network training by dynamic sparse model reconfiguration S Lym, E Choukse, S Zangeneh, W Wen, S Sanghavi, M Erez Proceedings of the International Conference for High Performance Computing …, 2019 | 115 | 2019 |
| Towards greener LLMs: Bringing energy-efficiency to the forefront of LLM inference J Stojkovic, E Choukse, C Zhang, Í Goiri, J Torrellas arXiv preprint arXiv:2403.20306, 2024 | 107 | 2024 |
| Characterizing power management opportunities for LLMs in the cloud P Patel, E Choukse, C Zhang, Í Goiri, B Warrier, N Mahalingam, ... Proceedings of the 29th ACM International Conference on Architectural …, 2024 | 104 | 2024 |
| Buddy Compression: Enabling Larger Memory for Deep Learning and HPC Workloads on GPUs E Choukse, M Sullivan, M O'Connor, M Erez, J Pool, D Nellans, S Keckler 47th International Symposium on Computer Architecture (ISCA 2020), 2020 | 71 | 2020 |
| Designing cloud servers for lower carbon J Wang, DS Berger, F Kazhamiaka, C Irvene, C Zhang, E Choukse, ... 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture …, 2024 | 62 | 2024 |
| Compresso: Pragmatic main memory compression E Choukse, M Erez, AR Alameldeen (MICRO) 51st Annual IEEE/ACM International Symposium on Microarchitecture …, 2018 | 46 | 2018 |
| Making kernel bypass practical for the cloud with Junction J Fried, GI Chaudhry, E Saurez, E Choukse, Í Goiri, S Elnikety, R Fonseca, ... 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI …, 2024 | 45 | 2024 |
| Myths and misconceptions around reducing carbon embedded in cloud platforms J Lyu, J Wang, K Frost, C Zhang, C Irvene, E Choukse, R Fonseca, ... Proceedings of the 2nd Workshop on Sustainable Computer Systems, 1-7, 2023 | 36 | 2023 |
| TAPAS: Thermal- and power-aware scheduling for LLM inference in cloud platforms J Stojkovic, C Zhang, Í Goiri, E Choukse, H Qiu, R Fonseca, J Torrellas, ... Proceedings of the 30th ACM International Conference on Architectural …, 2025 | 34 | 2025 |
| EcoServe: Designing carbon-aware AI inference systems Y Li, Z Hu, E Choukse, R Fonseca, GE Suh, U Gupta arXiv preprint arXiv:2502.05043, 2025 | 27 | 2025 |
| Towards improved power management in cloud GPUs P Patel, Z Gong, S Rizvi, E Choukse, P Misra, T Anderson, A Sriraman IEEE Computer Architecture Letters 22 (2), 141-144, 2023 | 24 | 2023 |
| POLCA: Power oversubscription in LLM cloud providers P Patel, E Choukse, C Zhang, Í Goiri, B Warrier, N Mahalingam, ... arXiv preprint arXiv:2308.12908, 2023 | 21 | 2023 |
| PruneTrain: Gradual structured pruning from scratch for faster neural network training S Lym, E Choukse, S Zangeneh, W Wen, M Erez, S Sanghavi CoRR, 2019 | 17 | 2019 |
| Mnemosyne: Parallelization strategies for efficiently serving multi-million context length LLM inference requests without approximations A Agrawal, J Chen, Í Goiri, R Ramjee, C Zhang, A Tumanov, E Choukse arXiv e-prints, arXiv:2409.17264, 2024 | 15 | 2024 |
| Overclocking in immersion-cooled datacenters PA Misra, I Manousakis, E Choukse, M Jalili, Í Goiri, A Raniwala, ... IEEE Micro 42 (4), 10-17, 2022 | 13 | 2022 |
| Translation-Optimized Memory Compression for Capacity Gagandeep Panwar, Muhammad Laghari, David Bears, Yuqing Liu, ... 55th IEEE/ACM International Symposium on Microarchitecture, 2022 | 13 | 2022 |
| ModServe: Modality- and Stage-Aware Resource Disaggregation for Scalable Multimodal Model Serving H Qiu, A Biswas, Z Zhao, J Mohan, A Khare, E Choukse, Í Goiri, Z Zhang, ... arXiv preprint arXiv:2502.00937, 2025 | 12 | 2025 |