| Mixed precision training P Micikevicius, S Narang, J Alben, G Diamos, E Elsen, D Garcia, ... arXiv preprint arXiv:1710.03740, 2017 | 2734 | 2017 |
| Conservation cores: reducing the energy of mature computations G Venkatesh, J Sampson, N Goulding, S Garcia, V Bryksin, ... ACM SIGARCH Computer Architecture News 38 (1), 205-218, 2010 | 717 | 2010 |
| Can FPGAs beat GPUs in accelerating next-generation deep neural networks? E Nurvitadhi, G Venkatesh, J Sim, D Marr, R Huang, J Ong Gee Hock, ... Proceedings of the 2017 ACM/SIGDA international symposium on field …, 2017 | 695 | 2017 |
| Accelerating binarized neural networks: Comparison of FPGA, CPU, GPU, and ASIC E Nurvitadhi, D Sheffield, J Sim, A Mishra, G Venkatesh, D Marr 2016 International Conference on Field-Programmable Technology (FPT), 77-84, 2016 | 476 | 2016 |
| Accelerating sparse deep neural networks A Mishra, JA Latorre, J Pool, D Stosic, D Stosic, G Venkatesh, C Yu, ... arXiv preprint arXiv:2104.08378, 2021 | 365 | 2021 |
| The GreenDroid mobile application processor: An architecture for silicon's dark future N Goulding-Hotta, J Sampson, G Venkatesh, S Garcia, J Auricchio, ... IEEE Micro 31 (2), 86-95, 2011 | 222 | 2011 |
| Accelerating Deep Convolutional Network via Low Precision and Sparsity G Venkatesh, E Nurvitadhi, D Marr arXiv, 2016 | 180* | 2016 |
| QsCores: Trading dark silicon for scalable energy efficiency with quasi-specific cores G Venkatesh, J Sampson, N Goulding-Hotta, SK Venkata, MB Taylor, ... Proceedings of the 44th Annual IEEE/ACM International Symposium on …, 2011 | 175 | 2011 |
| Runnemede: An architecture for ubiquitous high-performance computing NP Carter, A Agrawal, S Borkar, R Cledat, H David, D Dunning, J Fryman, ... High Performance Computer Architecture (HPCA2013), 2013 IEEE 19th …, 2013 | 138 | 2013 |
| Unbounded page-based transactional memory W Chuang, S Narayanasamy, G Venkatesh, J Sampson, ... ACM SIGPLAN Notices 41 (11), 347-358, 2006 | 132 | 2006 |
| Mixed precision training. arXiv 2017 P Micikevicius, S Narang, J Alben, G Diamos, E Elsen, D Garcia, ... arXiv preprint arXiv:1710.03740, 2017 | 131 | 2017 |
| Mixed precision training S Narang, G Diamos, E Elsen, P Micikevicius, J Alben, D Garcia, ... Proc. 6th Int. Conf. on Learning Representations (ICLR), 2018 | 100 | 2018 |
| Hardware accelerator architecture and template for web-scale k-means clustering E Nurvitadhi, G Venkatesh, S Krishnan, S Subhaschandra, D Marr US Patent App. 15/396,515, 2018 | 74 | 2018 |
| Efficient complex operators for irregular codes J Sampson, G Venkatesh, N Goulding-Hotta, S Garcia, S Swanson, ... High Performance Computer Architecture (HPCA), 2011 IEEE 17th International …, 2011 | 64 | 2011 |
| System and method for performing small channel count convolutions in energy-efficient input operand stationary accelerator G Venkatesh, L Lai, I Pierce, J Chuang, M Li US Patent 11,675,998, 2023 | 60 | 2023 |
| GreenDroid: A mobile application processor for a future of dark silicon N Goulding, J Sampson, G Venkatesh, S Garcia, J Auricchio, J Babb, ... Hot Chips 22, 2010 | 60 | 2010 |
| Learning dynamic network using a reuse gate function in semi-supervised video object segmentation H Park, J Yoo, S Jeong, G Venkatesh, N Kwak Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2021 | 56 | 2021 |
| Can temporal information help with contrastive self-supervised learning? Y Bai, H Fan, I Misra, G Venkatesh, Y Lu, Y Zhou, Q Yu, V Chandra, ... arXiv preprint arXiv:2011.13046, 2020 | 54 | 2020 |
| Programmable memory prefetcher for prefetching multiple cache lines based on data in a prefetch engine control register G Venkatesh, CB Wilkerson, SH Pugsley, DT Marr US Patent 10,452,551, 2019 | 51 | 2019 |
| Efficient sparse array handling in a processor G Venkatesh, TC Zhang, DT Marr US Patent App. 14/747,182, 2016 | 47 | 2016 |