| Google's neural machine translation system: Bridging the gap between human and machine translation Y Wu arXiv preprint arXiv:1609.08144, 2016 | 10236 | 2016 |
| In-datacenter performance analysis of a tensor processing unit NP Jouppi, C Young, N Patil, D Patterson, G Agrawal, R Bajwa, S Bates, ... Proceedings of the 44th annual international symposium on computer …, 2017 | 6912 | 2017 |
| Anton, a special-purpose machine for molecular dynamics simulation DE Shaw, MM Deneroff, RO Dror, JS Kuskin, RH Larson, JK Salmon, ... Communications of the ACM 51 (7), 91-97, 2008 | 1044 | 2008 |
| Anton 2: raising the bar for performance and programmability in a special-purpose molecular dynamics supercomputer DE Shaw, JP Grossman, JA Bank, B Batson, JA Butts, JC Chao, ... SC'14: Proceedings of the International Conference for High Performance …, 2014 | 814 | 2014 |
| Millisecond-scale molecular dynamics simulations on Anton DE Shaw, RO Dror, JK Salmon, JP Grossman, KM Mackenzie, JA Bank, ... Proceedings of the conference on high performance computing networking …, 2009 | 738 | 2009 |
| TPU v4: An optically reconfigurable supercomputer for machine learning with hardware support for embeddings N Jouppi, G Kurian, S Li, P Ma, R Nagarajan, L Nai, N Patil, ... Proceedings of the 50th annual international symposium on computer …, 2023 | 683 | 2023 |
| Ten lessons from three generations shaped Google’s TPUv4i: Industrial product NP Jouppi, DH Yoon, M Ashcraft, M Gottscho, TB Jablin, G Kurian, ... 2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture …, 2021 | 558 | 2021 |
| Embedded computing: a VLIW approach to architecture, compilers and tools JA Fisher, P Faraboschi Elsevier, 2005 | 537 | 2005 |
| Mesh-TensorFlow: Deep learning for supercomputers N Shazeer, Y Cheng, N Parmar, D Tran, A Vaswani, P Koanantakool, ... Advances in neural information processing systems 31, 2018 | 515 | 2018 |
| MLPerf training benchmark P Mattson, C Cheng, G Diamos, C Coleman, P Micikevicius, D Patterson, ... Proceedings of Machine Learning and Systems 2, 336-349, 2020 | 432 | 2020 |
| A domain-specific supercomputer for training deep neural networks NP Jouppi, DH Yoon, G Kurian, S Li, N Patil, J Laudon, C Young, ... Communications of the ACM 63 (7), 67-78, 2020 | 400 | 2020 |
| Anton, a special-purpose machine for molecular dynamics simulation DE Shaw, MM Deneroff, RO Dror, JS Kuskin, RH Larson, JK Salmon, ... ACM SIGARCH Computer Architecture News 35 (2), 1-12, 2007 | 384 | 2007 |
| Sparse GPU kernels for deep learning T Gale, M Zaharia, C Young, E Elsen SC20: International Conference for High Performance Computing, Networking …, 2020 | 359 | 2020 |
| Motivation for and evaluation of the first tensor processing unit N Jouppi, C Young, N Patil, D Patterson IEEE Micro 38 (3), 10-19, 2018 | 354 | 2018 |
| Measurement of longitudinal flow decorrelations in Pb+Pb collisions at √s_NN = 2.76 and 5.02 TeV with the ATLAS detector M Aaboud, G Aad, B Abbott, O Abdinov, B Abeloos, SH Abidi, ... The European Physical Journal C 78 (2), 142, 2018 | 288* | 2018 |
| A new golden age in computer architecture: Empowering the machine-learning revolution J Dean, D Patterson, C Young IEEE Micro 38 (2), 21-29, 2018 | 275 | 2018 |
| A domain-specific architecture for deep neural networks NP Jouppi, C Young, N Patil, D Patterson Communications of the ACM 61 (9), 50-59, 2018 | 243 | 2018 |
| A comparative analysis of schemes for correlated branch prediction C Young, N Gloy, MD Smith ACM SIGARCH Computer Architecture News 23 (2), 276-286, 1995 | 227 | 1995 |
| Neuromorphic computing at scale D Kudithipudi, C Schuman, CM Vineyard, T Pandit, C Merkel, ... Nature 637 (8047), 801-812, 2025 | 225 | 2025 |
| The design process for Google's training chips: TPUv2 and TPUv3 T Norrie, N Patil, DH Yoon, G Kurian, S Li, J Laudon, C Young, N Jouppi, ... IEEE Micro 41 (2), 56-63, 2021 | 194 | 2021 |