| MLPerf inference benchmark VJ Reddi, C Cheng, D Kanter, P Mattson, G Schmuelling, CJ Wu, ... 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture …, 2020 | 855* | 2020 |
| Benchmarking TinyML systems: Challenges and direction CR Banbury, VJ Reddi, M Lam, W Fu, A Fazel, J Holleman, X Huang, ... arXiv preprint arXiv:2003.04821, 2020 | 397 | 2020 |
| Automatically tuning sparse matrix-vector multiplication for GPU architectures A Monakov, A Lokhmotov, A Avetisyan International Conference on High-Performance Embedded Architectures and …, 2010 | 376 | 2010 |
| PENCIL: A platform-neutral compute intermediate language for accelerator programming R Baghdadi, U Beaugnon, A Cohen, T Grosser, M Kruse, C Reddy, ... 2015 International Conference on Parallel Architecture and Compilation (PACT …, 2015 | 166 | 2015 |
| Automatically generating and tuning GPU code for sparse matrix-vector multiplication from a high-level representation D Grewe, A Lokhmotov Proceedings of the Fourth Workshop on General Purpose Processing on Graphics …, 2011 | 74 | 2011 |
| Collective Mind: Towards Practical and Collaborative Auto‐Tuning G Fursin, R Miceli, A Lokhmotov, M Gerndt, M Baboulin, AD Malony, ... Scientific Programming 22 (4), 309-329, 2014 | 49 | 2014 |
| Deriving efficient data movement from decoupled access/execute specifications LW Howes, A Lokhmotov, AF Donaldson, PHJ Kelly International Conference on High-Performance Embedded Architectures and …, 2009 | 45 | 2009 |
| Collective knowledge: Towards R&D sustainability G Fursin, A Lokhmotov, E Plowman 2016 Design, Automation & Test in Europe Conference & Exhibition (DATE), 864-869, 2016 | 38 | 2016 |
| Benchmarking TinyML systems: Challenges and direction CR Banbury, VJ Reddi, M Lam, W Fu, A Fazel, J Holleman, X Huang, ... arXiv preprint arXiv:2003.04821, 2020 | 31 | 2020 |
| VOBLA: A vehicle for optimized basic linear algebra U Beaugnon, A Kravets, S Van Haastregt, R Baghdadi, D Tweed, J Absar, ... Proceedings of the 2014 SIGPLAN/SIGBED conference on Languages, compilers …, 2014 | 28 | 2014 |
| PENCIL 1.0 Language Specification R Baghdadi, A Cohen, T Grosser, S Verdoolaege, J Absar, ... Research Report 8706, INRIA, 2015 | 26* | 2015 |
| PENCIL: Towards a platform-neutral compute intermediate language for DSLs R Baghdadi, A Cohen, S Guelton, S Verdoolaege, J Inoue, T Grosser, ... arXiv preprint arXiv:1302.5586, 2013 | 26 | 2013 |
| A collective knowledge workflow for collaborative research into multi-objective autotuning and machine learning techniques G Fursin, A Lokhmotov, D Savenko, E Upton arXiv preprint arXiv:1801.08024, 2018 | 22 | 2018 |
| Benchmarking TinyML systems: Challenges and direction CR Banbury, VJ Reddi, M Lam, W Fu, A Fazel, J Holleman, X Huang, ... arXiv preprint arXiv:2003.04821, 2020 | 21 | 2020 |
| MLPerf Power: Benchmarking the energy efficiency of machine learning systems from microwatts to megawatts for sustainable AI A Tschand, ATR Rajan, S Idgunji, A Ghosh, J Holleman, C Kiraly, ... arXiv preprint arXiv:2410.12032, 2024 | 20 | 2024 |
| Collective Mind, Part II: Towards performance-and cost-aware software engineering as a natural science G Fursin, A Memon, C Guillon, A Lokhmotov arXiv preprint arXiv:1506.06256, 2015 | 19 | 2015 |
| MLPerf Power: Benchmarking the Energy Efficiency of Machine Learning Systems from μWatts to MWatts for Sustainable AI A Tschand, ATR Rajan, S Idgunji, A Ghosh, J Holleman, C Kiraly, ... 2025 IEEE International Symposium on High Performance Computer Architecture …, 2025 | 18 | 2025 |
| On the anatomy of predictive models for accelerating GPU convolution kernels and beyond PS Labini, M Cianfriglia, D Perri, O Gervasi, G Fursin, A Lokhmotov, ... ACM Transactions on Architecture and Code Optimization (TACO) 18 (1), 1-24, 2021 | 18 | 2021 |
| Auto-parallelisation of Sieve C++ programs A Donaldson, C Riley, A Lokhmotov, A Cook European Conference on Parallel Processing, 18-27, 2007 | 18 | 2007 |
| Delayed side-effects ease multi-core programming A Lokhmotov, A Mycroft, A Richards European Conference on Parallel Processing, 641-650, 2007 | 17 | 2007 |