| The lightweight distributed metric service: a scalable infrastructure for continuous monitoring of large scale computing systems and applications A Agelastos, B Allan, J Brandt, P Cassella, J Enos, J Fullop, A Gentile, ... SC'14: Proceedings of the International Conference for High Performance …, 2014 | 336 | 2014 |
| Reaction chemistry and optimization of plasma remediation of NxOy from gas streams AC Gentile, MJ Kushner Journal of applied physics 78 (3), 2074-2085, 1995 | 278 | 1995 |
| Resource monitoring and management with OVIS to enable HPC in cloud computing environments J Brandt, A Gentile, J Mayo, P Pébay, D Roe, D Thompson, M Wong 2009 IEEE International Symposium on Parallel & Distributed Processing, 1-8, 2009 | 99 | 2009 |
| Microstreamer dynamics during plasma remediation of NO using atmospheric pressure dielectric barrier discharges AC Gentile, MJ Kushner Journal of applied physics 79 (8), 3877-3885, 1996 | 71 | 1996 |
| OVIS-2: A robust distributed architecture for scalable RAS JM Brandt, BJ Debusschere, AC Gentile, JR Mayo, PP Pébay, ... 2008 IEEE International Symposium on Parallel and Distributed Processing, 1-8, 2008 | 52 | 2008 |
| Toward Rapid Understanding of Production HPC Applications and Systems A Agelastos, B Allan, J Brandt, A Gentile, S Lefantzi, S Monk, J Ogden, ... Cluster Computing (CLUSTER), 2015 IEEE International Conference on, 464-473, 2015 | 46 | 2015 |
| Baler: deterministic, lossless log message clustering tool N Taerat, J Brandt, A Gentile, M Wong, C Leangsuksun Computer Science-Research and Development 26 (3), 285-295, 2011 | 44 | 2011 |
| Ovis: a tool for intelligent, real-time monitoring of computational clusters JM Brandt, AC Gentile, DJ Hale, PP Pébay Proceedings 20th IEEE International Parallel & Distributed Processing …, 2006 | 38 | 2006 |
| Integrating low-latency analysis into HPC system monitoring R Izadpanah, N Naksinehaboon, J Brandt, A Gentile, D Dechev Proceedings of the 47th International Conference on Parallel Processing, 1-10, 2018 | 35 | 2018 |
| Measuring Congestion in High-Performance Datacenter Interconnects S Jha, A Patke, J Brandt, A Gentile, B Lim, M Showerman, G Bauer, ... 17th USENIX Symposium on Networked Systems Design and Implementation (NSDI …, 2020 | 32 | 2020 |
| Demonstrating improved application performance using dynamic monitoring and task mapping J Brandt, K Devine, A Gentile, K Pedretti 2014 IEEE International Conference on Cluster Computing (CLUSTER), 408-415, 2014 | 28 | 2014 |
| Methodologies for advance warning of compute cluster problems via statistical analysis: A case study J Brandt, A Gentile, J Mayo, P Pébay, D Roe, D Thompson, M Wong Proceedings of the 2009 workshop on Resiliency in high performance, 7-14, 2009 | 28 | 2009 |
| Filtering log data: Finding the needles in the haystack L Yu, Z Zheng, Z Lan, T Jones, JM Brandt, AC Gentile IEEE/IFIP International Conference on Dependable Systems and Networks (DSN …, 2012 | 25 | 2012 |
| Overtime: A tool for analyzing performance variation due to network interference RE Grant, KT Pedretti, A Gentile Proceedings of the 3rd Workshop on Exascale MPI, 1-10, 2015 | 24 | 2015 |
| Enabling Advanced Operational Analysis Through Multi-subsystem Data Integration on Trinity. JM Brandt, D DeBonis, AC Gentile, J Lujan, C Martin, DJ Martinez, ... Sandia National Lab.(SNL-CA), Livermore, CA (United States); Sandia National …, 2015 | 24 | 2015 |
| Using probabilistic characterization to reduce runtime faults in HPC systems J Brandt, B Debusschere, A Gentile, J Mayo, P Pébay, D Thompson, ... 2008 Eighth IEEE International Symposium on Cluster Computing and the Grid …, 2008 | 24 | 2008 |
| Continuous whole-system monitoring toward rapid understanding of production HPC applications and systems A Agelastos, B Allan, J Brandt, A Gentile, S Lefantzi, S Monk, J Ogden, ... Parallel Computing 58, 90-106, 2016 | 23 | 2016 |
| Quantifying effectiveness of failure prediction and response in HPC systems: Methodology and example J Brandt, F Chen, V De Sapio, A Gentile, J Mayo, P Pébay, D Roe, ... 2010 International Conference on Dependable Systems and Networks Workshops …, 2010 | 23 | 2010 |
| Design and Implementation of a Scalable Monitoring System for Trinity. A DeConinck, A Bonnie, K Kelly, S Sanchez, C Martin, M Mason, ... Sandia National Lab.(SNL-NM), Albuquerque, NM (United States), 2016 | 20 | 2016 |
| Lilith: Scalable execution of user code for distributed computing DA Evensky, AC Gentile, LJ Camp, RC Armstrong Proceedings. The Sixth IEEE International Symposium on High Performance …, 1997 | 20 | 1997 |