Jon Stearley
Faith Comes By Hearing
Verified email at sandia.gov
Title
Cited by
Year
What supercomputers say: A study of five system logs
A Oliner, J Stearley
37th annual IEEE/IFIP international conference on dependable systems and …, 2007
Cited by: 871; Year: 2007
Addressing failures in exascale computing
M Snir, RW Wisniewski, JA Abraham, SV Adve, S Bagchi, P Balaji, J Belak, ...
The International Journal of High Performance Computing Applications 28 (2 …, 2014
Cited by: 561; Year: 2014
Memory errors in modern systems: The good, the bad, and the ugly
V Sridharan, N DeBardeleben, S Blanchard, KB Ferreira, J Stearley, ...
ACM SIGARCH Computer Architecture News 43 (1), 297-310, 2015
Cited by: 419; Year: 2015
Evaluating the viability of process replication reliability for exascale systems
K Ferreira, J Stearley, JH Laros III, R Oldfield, K Pedretti, R Brightwell, ...
Proceedings of 2011 International Conference for High Performance Computing …, 2011
Cited by: 341; Year: 2011
Feng shui of supercomputer memory: Positional effects in DRAM and SRAM faults
V Sridharan, J Stearley, N DeBardeleben, S Blanchard, S Gurumurthi
Proceedings of the International Conference on High Performance Computing …, 2013
Cited by: 258; Year: 2013
Towards informatic analysis of syslogs
J Stearley
2004 IEEE International Conference on Cluster Computing (IEEE Cat. No …, 2004
Cited by: 195; Year: 2004
Alert detection in system logs
AJ Oliner, A Aiken, J Stearley
2008 Eighth IEEE International Conference on Data Mining, 959-964, 2008
Cited by: 134; Year: 2008
Bad words: Finding faults in Spirit's syslogs
J Stearley, AJ Oliner
2008 Eighth IEEE International Symposium on Cluster Computing and the Grid …, 2008
Cited by: 95; Year: 2008
Bridging the gaps: Joining information sources with Splunk
J Stearley, S Corwell, K Lord
Workshop on Managing Systems via Log Analysis and Machine Learning …, 2010
Cited by: 54; Year: 2010
Inter-agency workshop on HPC resilience at extreme scale
J Daly, B Harrod, T Hoang, L Nowell, B Adolf, S Borkar, N DeBardeleben, ...
National Security Agency Advanced Computing Systems, 2012
Cited by: 45; Year: 2012
Does partial replication pay off?
J Stearley, K Ferreira, D Robinson, J Laros, K Pedretti, D Arnold, ...
IEEE/IFIP International Conference on Dependable Systems and Networks …, 2012
Cited by: 38; Year: 2012
Increasing fault resiliency in a message-passing environment
K Ferreira, R Riesen, R Oldfield, J Stearley, J Laros, K Pedretti, ...
Sandia National Laboratories, Technical report SAND2009-6753, 2009
Cited by: 38; Year: 2009
Defining and measuring supercomputer Reliability, Availability, and Serviceability (RAS)
J Stearley
Proceedings of the Linux Clusters Institute conference, 2005
Cited by: 37; Year: 2005
Redundant computing for exascale systems
R Riesen, K Ferreira, J Stearley, R Oldfield, JH Laros III, K Pedretti, ...
Sandia National Laboratories, 2010
Cited by: 35; Year: 2010
See applications run and throughput jump: The case for redundant computing in HPC
R Riesen, K Ferreira, J Stearley
2010 International Conference on Dependable Systems and Networks Workshops …, 2010
Cited by: 30; Year: 2010
rMPI: increasing fault resiliency in a message-passing environment
K Ferreira, R Riesen, R Oldfield, J Stearley, J Laros, K Pedretti, ...
Sandia National Laboratories, Albuquerque, NM, Tech. Rep. SAND2011-2488, 2011
Cited by: 25; Year: 2011
JHL III, R
K Ferreira, R Riesen, P Bridges, D Arnold, J Stearley
Oldfield, K. Pedretti, and R. Brightwell,“Evaluating the viability of …, 2011
Cited by: 23; Year: 2011
Extra bits on SRAM and DRAM errors–more data from the field
N DeBardeleben, S Blanchard, V Sridharan, S Gurumurthi, J Stearley, ...
IEEE Workshop on Silicon Errors in Logic-System Effects (SELSE), 2014
Cited by: 20; Year: 2014
Sisyphus log data mining toolkit
J Stearley
Accessed from the Web, 2009
Cited by: 13; Year: 2009
A State-Machine Approach to Disambiguating Supercomputer Event Logs
J Stearley, R Ballance, L Bauman
Cited by: 10; Year: 2012