| Bootstrap your own latent: A new approach to self-supervised learning JB Grill, F Strub, F Altché, C Tallec, P Richemond, E Buchatskaya, ... Advances in neural information processing systems 33, 21271-21284, 2020 | 9788 | 2020 |
| Gemini: A family of highly capable multimodal models G Team, R Anil, S Borgeaud, JB Alayrac, J Yu, R Soricut, J Schalkwyk, ... arXiv preprint arXiv:2312.11805, 2023 | 6992 | 2023 |
| Training compute-optimal large language models J Hoffmann, S Borgeaud, A Mensch, E Buchatskaya, T Cai, E Rutherford, ... arXiv preprint arXiv:2203.15556, 2022 | 4505* | 2022 |
| Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context G Team, P Georgiev, VI Lei, R Burnell, L Bai, A Gulati, G Tanzer, ... arXiv preprint arXiv:2403.05530, 2024 | 3439 | 2024 |
| Gemma: Open models based on Gemini research and technology G Team, T Mesnard, C Hardin, R Dadashi, S Bhupatiraju, S Pathak, ... arXiv preprint arXiv:2403.08295, 2024 | 2255 | 2024 |
| Gemma 2: Improving open language models at a practical size G Team, M Riviere, S Pathak, PG Sessa, C Hardin, S Bhupatiraju, ... arXiv preprint arXiv:2408.00118, 2024 | 1691 | 2024 |
| Scaling language models: Methods, analysis & insights from training Gopher JW Rae, S Borgeaud, T Cai, K Millican, J Hoffmann, F Song, J Aslanides, ... arXiv preprint arXiv:2112.11446, 2021 | 1538 | 2021 |
| Gemini 2.5: Pushing the frontier with advanced reasoning, multimodality, long context, and next generation agentic capabilities G Comanici, E Bieber, M Schaekermann, I Pasupat, N Sachdeva, I Dhillon, ... arXiv preprint arXiv:2507.06261, 2025 | 1337 | 2025 |
| MedGemma technical report A Sellergren, S Kazemzadeh, T Jaroensri, A Kiraly, M Traverse, ... arXiv preprint arXiv:2507.05201, 2025 | 170 | 2025 |
| Unified scaling laws for routed language models A Clark, D de Las Casas, A Guy, A Mensch, M Paganini, J Hoffmann, ... International conference on machine learning, 4057-4086, 2022 | 120 | 2022 |
| Training compute-optimal large language models J Hoffmann, S Borgeaud, A Mensch, E Buchatskaya, T Cai, E Rutherford, ... arXiv preprint arXiv:2203.15556, 2022 | 116 | 2022 |
| Scaling language models: Methods, analysis & insights from training Gopher JW Rae, S Borgeaud, T Cai, K Millican, J Hoffmann, HF Song, J Aslanides, ... arXiv preprint arXiv:2112.11446, 2021 | 83 | 2021 |
| Bootstrap your own latent: A new approach to self-supervised learning JB Grill, F Strub, F Altché, C Tallec, PH Richemond, E Buchatskaya, ... arXiv preprint arXiv:2006.07733, 2020 | 57 | 2020 |
| Training compute-optimal large language models J Hoffmann, S Borgeaud, A Mensch, E Buchatskaya, T Cai, E Rutherford, ... arXiv preprint arXiv:2203.15556, 2022 | 48 | 2022 |
| Gemma: Open models based on Gemini research and technology G Team, T Mesnard, C Hardin, R Dadashi, S Bhupatiraju, S Pathak, ... arXiv preprint arXiv:2403.08295, 2024 | 32 | 2024 |