Mohamed Assem Ibrahim
AMD Research and Advanced Development (RAD)
Verified email at amd.com
Title
Cited by
Year
Controlled kernel launch for dynamic parallelism in GPUs
X Tang, A Pattnaik, H Jiang, O Kayiran, A Jog, S Pai, M Ibrahim, ...
2017 IEEE International Symposium on High Performance Computer Architecture …, 2017
Cited by: 67; Year: 2017
Efficient and fair multi-programming in GPUs via effective bandwidth management
H Wang, F Luo, M Ibrahim, O Kayiran, A Jog
2018 IEEE International Symposium on High Performance Computer Architecture …, 2018
Cited by: 58; Year: 2018
Architectural support for efficient large-scale automata processing
H Liu, M Ibrahim, O Kayiran, S Pai, A Jog
2018 51st Annual IEEE/ACM International Symposium on Microarchitecture …, 2018
Cited by: 32; Year: 2018
Analyzing and leveraging decoupled L1 caches in GPUs
MA Ibrahim, O Kayiran, Y Eckert, GH Loh, A Jog
2021 IEEE International Symposium on High-Performance Computer Architecture …, 2021
Cited by: 28; Year: 2021
Analyzing and leveraging remote-core bandwidth for enhanced performance in GPUs
MA Ibrahim, H Liu, O Kayiran, A Jog
2019 28th International Conference on Parallel Architectures and Compilation …, 2019
Cited by: 26; Year: 2019
Analyzing and leveraging shared L1 caches in GPUs
MA Ibrahim, O Kayiran, Y Eckert, GH Loh, A Jog
Proceedings of the ACM International Conference on Parallel Architectures …, 2020
Cited by: 22; Year: 2020
Proactive scheduling for content pre-fetching in mobile networks
O Shoukry, M Abd ElMohsen, J Tadrous, H El Gamal, T ElBatt, N Wanas, ...
2014 IEEE International Conference on Communications (ICC), 2848-2854, 2014
Cited by: 16; Year: 2014
Efficient Cache Utilization via Model-aware Data Placement for Recommendation Models
MA Ibrahim, O Kayiran, S Aga
Proceedings of the International Symposium on Memory Systems, 1-11, 2021
Cited by: 9; Year: 2021
Address-stride assisted approximate load value prediction in GPUs
H Wang, M Ibrahim, S Mittal, A Jog
Proceedings of the ACM International Conference on Supercomputing, 184-194, 2019
Cited by: 6; Year: 2019
Just-in-time Quantization with Processing-In-Memory for Efficient ML Training
MA Ibrahim, S Aga, A Li, S Pati, M Islam
arXiv preprint arXiv:2311.05034, 2023
Cited by: 5; Year: 2023
Inclusive-PIM: Hardware-Software Co-design for Broad Acceleration on Commercial PIM Architectures
J Alsop, S Aga, M Ibrahim, M Islam, A Mccrabb, N Jayasena
arXiv preprint arXiv:2309.07984, 2023
Cited by: 5; Year: 2023
PIMnast: Balanced Data Placement for GEMV Acceleration with Processing-In-Memory
MA Ibrahim, M Islam, S Aga
SC24-W: Workshops of the International Conference for High Performance …, 2024
Cited by: 4; Year: 2024
Balanced Data Placement for GEMV Acceleration with Processing-In-Memory
MA Ibrahim, M Islam, S Aga
arXiv preprint arXiv:2403.20297, 2024
Cited by: 4; Year: 2024
Collaborative Acceleration for FFT on Commercial Processing-In-Memory Architectures
MA Ibrahim, S Aga
arXiv preprint arXiv:2308.03973, 2023
Cited by: 4; Year: 2023
Distributing Model Data in Memories in Nodes in an Electronic Device
MAAEM Ibrahim, O Kayiran, AGA Shaizeen
US Patent App. 17/489,576, 2023
Cited by: 4; Year: 2023
Pimacolaba: Collaborative Acceleration for FFT on Commercial Processing-In-Memory Architectures
MA Ibrahim, S Aga
Proceedings of the International Symposium on Memory Systems, 13-25, 2024
Cited by: 3; Year: 2024
JIT-Q: Just-in-time Quantization with Processing-In-Memory for Efficient ML Training
M Ibrahim, S Aga, A Li, S Pati, M Islam
Proceedings of Machine Learning and Systems 6, 46-59, 2024
Cited by: 3; Year: 2024
Layer-wise Performance Bottleneck Analysis of Deep Neural Networks
H Zhao, C Weinshenker, M Ibrahim, A Jog, J Zhao
The 1st International Workshop on Architectures for Intelligent Machine, 2017
Cited by: 2; Year: 2017
Methodology for Fine-Grain GPU Power Visibility and Insights
V Singhania, S Aga, MA Ibrahim
arXiv preprint arXiv:2412.12426, 2024
Cited by: 1; Year: 2024
Local Triggering of Processing-in-Memory Operations
MAAE Ibrahim, SD Aga, M Islam
US Patent App. 18/207,314, 2024
Cited by: 1; Year: 2024