Esha Choukse
Microsoft Research
Verified email at utexas.edu
Title / Cited by / Year
Splitwise: Efficient generative LLM inference using phase splitting
P Patel, E Choukse, C Zhang, A Shah, Í Goiri, S Maleki, R Bianchini
2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture …, 2024
Cited by: 474, Year: 2024
DynamoLLM: Designing LLM inference clusters for performance and energy efficiency
J Stojkovic, C Zhang, Í Goiri, J Torrellas, E Choukse
2025 IEEE International Symposium on High Performance Computer Architecture …, 2025
Cited by: 148, Year: 2025
Bit-plane compression: Transforming data for better compression in many-core architectures
J Kim, M Sullivan, E Choukse, M Erez
ACM SIGARCH Computer Architecture News 44 (3), 329-340, 2016
Cited by: 138, Year: 2016
PruneTrain: Fast neural network training by dynamic sparse model reconfiguration
S Lym, E Choukse, S Zangeneh, W Wen, S Sanghavi, M Erez
Proceedings of the International Conference for High Performance Computing …, 2019
Cited by: 115, Year: 2019
Towards greener LLMs: Bringing energy-efficiency to the forefront of LLM inference
J Stojkovic, E Choukse, C Zhang, Í Goiri, J Torrellas
arXiv preprint arXiv:2403.20306, 2024
Cited by: 107, Year: 2024
Characterizing power management opportunities for LLMs in the cloud
P Patel, E Choukse, C Zhang, Í Goiri, B Warrier, N Mahalingam, ...
Proceedings of the 29th ACM International Conference on Architectural …, 2024
Cited by: 104, Year: 2024
Buddy Compression: Enabling Larger Memory for Deep Learning and HPC Workloads on GPUs
E Choukse, M Sullivan, M O'Connor, M Erez, J Pool, D Nellans, S Keckler
47th International Symposium on Computer Architecture (ISCA 2020), 2020
Cited by: 71, Year: 2020
Designing cloud servers for lower carbon
J Wang, DS Berger, F Kazhamiaka, C Irvene, C Zhang, E Choukse, ...
2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture …, 2024
Cited by: 62, Year: 2024
Compresso: Pragmatic main memory compression
E Choukse, M Erez, AR Alameldeen
(MICRO) 51st Annual IEEE/ACM International Symposium on Microarchitecture …, 2018
Cited by: 46, Year: 2018
Making kernel bypass practical for the cloud with Junction
J Fried, GI Chaudhry, E Saurez, E Choukse, Í Goiri, S Elnikety, R Fonseca, ...
21st USENIX Symposium on Networked Systems Design and Implementation (NSDI …, 2024
Cited by: 45, Year: 2024
Myths and misconceptions around reducing carbon embedded in cloud platforms
J Lyu, J Wang, K Frost, C Zhang, C Irvene, E Choukse, R Fonseca, ...
Proceedings of the 2nd Workshop on Sustainable Computer Systems, 1-7, 2023
Cited by: 36, Year: 2023
TAPAS: Thermal-and power-aware scheduling for LLM inference in cloud platforms
J Stojkovic, C Zhang, Í Goiri, E Choukse, H Qiu, R Fonseca, J Torrellas, ...
Proceedings of the 30th ACM International Conference on Architectural …, 2025
Cited by: 34, Year: 2025
EcoServe: Designing carbon-aware AI inference systems
Y Li, Z Hu, E Choukse, R Fonseca, GE Suh, U Gupta
arXiv preprint arXiv:2502.05043, 2025
Cited by: 27, Year: 2025
Towards improved power management in cloud GPUs
P Patel, Z Gong, S Rizvi, E Choukse, P Misra, T Anderson, A Sriraman
IEEE Computer Architecture Letters 22 (2), 141-144, 2023
Cited by: 24, Year: 2023
POLCA: Power oversubscription in LLM cloud providers
P Patel, E Choukse, C Zhang, Í Goiri, B Warrier, N Mahalingam, ...
arXiv preprint arXiv:2308.12908, 2023
Cited by: 21, Year: 2023
PruneTrain: Gradual structured pruning from scratch for faster neural network training
S Lym, E Choukse, S Zangeneh, W Wen, M Erez, S Sanghavi
CoRR, 2019
Cited by: 17, Year: 2019
Mnemosyne: Parallelization strategies for efficiently serving multi-million context length LLM inference requests without approximations
A Agrawal, J Chen, Í Goiri, R Ramjee, C Zhang, A Tumanov, E Choukse
arXiv preprint arXiv:2409.17264, 2024
Cited by: 15, Year: 2024
Overclocking in immersion-cooled datacenters
PA Misra, I Manousakis, E Choukse, M Jalili, Í Goiri, A Raniwala, ...
IEEE Micro 42 (4), 10-17, 2022
Cited by: 13, Year: 2022
Translation-Optimized Memory Compression for Capacity
CJVT Gagandeep Panwar, Muhammad Laghari, David Bears, Yuqing Liu, ...
55th IEEE/ACM International Symposium on Microarchitecture, 2022
Cited by: 13*, Year: 2022
ModServe: Modality-and Stage-Aware Resource Disaggregation for Scalable Multimodal Model Serving
H Qiu, A Biswas, Z Zhao, J Mohan, A Khare, E Choukse, Í Goiri, Z Zhang, ...
arXiv preprint arXiv:2502.00937, 2025
Cited by: 12, Year: 2025