Esha Choukse

Cited by

	All	Since 2021
Citations	1639	1538
h-index	16	15
i10-index	23	22

Citations per year

Year	Citations
2017	11
2018	18
2019	37
2020	32
2021	57
2022	66
2023	71
2024	318
2025	995
2026	30

Public access

12 articles available
0 articles not available

Based on funding mandates

Co-authors

Mattan Erez, The University of Texas at Austin (verified email at utexas.edu)
Michael B Sullivan, Senior Research Scientist, NVIDIA (verified email at utexas.edu)
Wei Wen, Research Scientist, AI at Meta (verified email at fb.com)
Sangkug Lym, Nvidia (verified email at utexas.edu)
Jungrae Kim, Sungkyunkwan University (verified email at skku.edu)
Alaa R. Alameldeen, Simon Fraser University (verified email at cs.sfu.ca)
Sujay Sanghavi, Professor, Electrical and Computer Engineering, University of Texas, Austin (verified email at mail.utexas.edu)
Mike O'Connor, NVIDIA Research (verified email at nvidia.com)
David Nellans, Senior Director @ NVIDIA Research (verified email at nellans.org)
Jeff Pool, Senior Architect, NVIDIA (verified email at nvidia.com)
Steve Keckler, Vice President of Architecture Research, NVIDIA (verified email at cs.utexas.edu)
Vivek Joy Kozhikkottu, Research Scientist, Intel Corp (verified email at intel.com)

Esha Choukse

Microsoft Research

Verified email at utexas.edu

Computer Architecture, Systems


Title	Cited by	Year
Splitwise: Efficient generative LLM inference using phase splitting; P Patel, E Choukse, C Zhang, A Shah, Í Goiri, S Maleki, R Bianchini; 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture …, 2024	474	2024
DynamoLLM: Designing LLM inference clusters for performance and energy efficiency; J Stojkovic, C Zhang, Í Goiri, J Torrellas, E Choukse; 2025 IEEE International Symposium on High Performance Computer Architecture …, 2025	148	2025
Bit-plane compression: Transforming data for better compression in many-core architectures; J Kim, M Sullivan, E Choukse, M Erez; ACM SIGARCH Computer Architecture News 44 (3), 329-340, 2016	138	2016
PruneTrain: Fast neural network training by dynamic sparse model reconfiguration; S Lym, E Choukse, S Zangeneh, W Wen, S Sanghavi, M Erez; Proceedings of the International Conference for High Performance Computing …, 2019	115	2019
Towards greener LLMs: Bringing energy-efficiency to the forefront of LLM inference; J Stojkovic, E Choukse, C Zhang, Í Goiri, J Torrellas; arXiv preprint arXiv:2403.20306, 2024	107	2024
Characterizing power management opportunities for LLMs in the cloud; P Patel, E Choukse, C Zhang, Í Goiri, B Warrier, N Mahalingam, ...; Proceedings of the 29th ACM International Conference on Architectural …, 2024	104	2024
Buddy Compression: Enabling Larger Memory for Deep Learning and HPC Workloads on GPUs; E Choukse, M Sullivan, M O'Connor, M Erez, J Pool, D Nellans, S Keckler; 47th International Symposium on Computer Architecture (ISCA 2020), 2020	71	2020
Designing cloud servers for lower carbon; J Wang, DS Berger, F Kazhamiaka, C Irvene, C Zhang, E Choukse, ...; 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture …, 2024	62	2024
Compresso: Pragmatic main memory compression; E Choukse, M Erez, AR Alameldeen; (MICRO) 51st Annual IEEE/ACM International Symposium on Microarchitecture …, 2018	46	2018
Making kernel bypass practical for the cloud with Junction; J Fried, GI Chaudhry, E Saurez, E Choukse, Í Goiri, S Elnikety, R Fonseca, ...; 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI …, 2024	45	2024
Myths and misconceptions around reducing carbon embedded in cloud platforms; J Lyu, J Wang, K Frost, C Zhang, C Irvene, E Choukse, R Fonseca, ...; Proceedings of the 2nd Workshop on Sustainable Computer Systems, 1-7, 2023	36	2023
TAPAS: Thermal- and power-aware scheduling for LLM inference in cloud platforms; J Stojkovic, C Zhang, Í Goiri, E Choukse, H Qiu, R Fonseca, J Torrellas, ...; Proceedings of the 30th ACM International Conference on Architectural …, 2025	34	2025
EcoServe: Designing carbon-aware AI inference systems; Y Li, Z Hu, E Choukse, R Fonseca, GE Suh, U Gupta; arXiv preprint arXiv:2502.05043, 2025	27	2025
Towards improved power management in cloud GPUs; P Patel, Z Gong, S Rizvi, E Choukse, P Misra, T Anderson, A Sriraman; IEEE Computer Architecture Letters 22 (2), 141-144, 2023	24	2023
POLCA: Power oversubscription in LLM cloud providers; P Patel, E Choukse, C Zhang, Í Goiri, B Warrier, N Mahalingam, ...; arXiv preprint arXiv:2308.12908, 2023	21	2023
PruneTrain: Gradual structured pruning from scratch for faster neural network training; S Lym, E Choukse, S Zangeneh, W Wen, M Erez, S Sanghavi; CoRR, 2019	17	2019
Mnemosyne: Parallelization strategies for efficiently serving multi-million context length LLM inference requests without approximations; A Agrawal, J Chen, Í Goiri, R Ramjee, C Zhang, A Tumanov, E Choukse; arXiv e-prints, arXiv:2409.17264, 2024	15	2024
Overclocking in immersion-cooled datacenters; PA Misra, I Manousakis, E Choukse, M Jalili, Í Goiri, A Raniwala, ...; IEEE Micro 42 (4), 10-17, 2022	13	2022
Translation-Optimized Memory Compression for Capacity; CJVT Gagandeep Panwar, Muhammad Laghari, David Bears, Yuqing Liu, ...; 55th IEEE/ACM International Symposium on Microarchitecture, 2022	13*	2022
ModServe: Modality- and Stage-Aware Resource Disaggregation for Scalable Multimodal Model Serving; H Qiu, A Biswas, Z Zhao, J Mohan, A Khare, E Choukse, Í Goiri, Z Zhang, ...; arXiv preprint arXiv:2502.00937, 2025	12	2025
