| Gemini: a family of highly capable multimodal models. G Team, R Anil, S Borgeaud, JB Alayrac, J Yu, R Soricut, J Schalkwyk, ... arXiv preprint arXiv:2312.11805, 2023 | 7002 | 2023 |
| Gemini 2.5: Pushing the frontier with advanced reasoning, multimodality, long context, and next generation agentic capabilities. G Comanici, E Bieber, M Schaekermann, I Pasupat, N Sachdeva, I Dhillon, ... arXiv preprint arXiv:2507.06261, 2025 | 1217 | 2025 |
| Q-BERT: Hessian based ultra low precision quantization of BERT. S Shen, Z Dong, J Ye, L Ma, Z Yao, A Gholami, MW Mahoney, K Keutzer. Proceedings of the AAAI Conference on Artificial Intelligence 34 (05), 8815-8821, 2020 | 733 | 2020 |
| Inefficiency of K-FAC for large batch size training. L Ma, G Montague, J Ye, Z Yao, A Gholami, K Keutzer, M Mahoney. Proceedings of the AAAI Conference on Artificial Intelligence 34 (04), 5053-5060, 2020 | 28 | 2020 |
| AutoHOOT: Automatic high-order optimization for tensors. L Ma, J Ye, E Solomonik. Proceedings of the ACM International Conference on Parallel Architectures …, 2020 | 14 | 2020 |
| UT5: Pretraining Non autoregressive T5 with unrolled denoising. MG Salem, J Ye, CC Lin, F Liu. arXiv preprint arXiv:2311.08552, 2023 | | 2023 |