Mostafa Dehghani

Cited by

	All	Since 2021
Citations	119611	118167
h-index	53	51
i10-index	94	77

Citations per year

Year	Citations
2019	355
2020	462
2021	2824
2022	10519
2023	21351
2024	36477
2025	45177
2026	1549

Public access (based on funding mandates)

20 articles available
0 articles not available

Co-authors

Alexey Dosovitskiy	Inceptive	Verified email at inceptive.team
Yi Tay	Research Scientist, Google DeepMind	Verified email at google.com
Anurag Arnab	Google DeepMind	Verified email at google.com
Jaap Kamps	University of Amsterdam	Verified email at uva.nl
Donald Metzler	Google DeepMind	Verified email at google.com
Georg Heigold	Research Scientist, Google Inc.	Verified email at google.com
Neil Houlsby	Google	Verified email at google.com
Mario Lučić	Research Scientist, Google DeepMind	Verified email at google.com
Matthias Minderer	Member of Technical Staff, Microsoft AI	Verified email at microsoft.com
Dara Bahri	Research Scientist, Google DeepMind	Verified email at google.com
Dirk Weissenborn	Inceptive Inc.	Verified email at inceptive.team
Xiaohua Zhai	Meta, OpenAI, Google DeepMind	Verified email at google.com
Jakob Uszkoreit	Inceptive	Verified email at uszkoreit.net
Alexey A. Gritsenko	Google DeepMind	Verified email at google.com
Vinh Q. Tran	Research Scientist, Google DeepMind	Verified email at google.com
Cordelia Schmid	Research director INRIA	Verified email at inria.fr
Samira Abnar	Apple ML Research	Verified email at apple.com
Hosein Azarbonyad	Elsevier	Verified email at elsevier.com
Chen Sun	Assistant Professor, Brown University	Verified email at brown.edu
Maarten Marx	Assistant Professor of Computer Science, University of Amsterdam	Verified email at uva.nl

Mostafa Dehghani

Research Scientist, Google DeepMind

Verified email at google.com

Machine Learning, Deep Learning


Title	Authors	Venue	Cited by	Year
An image is worth 16x16 words: Transformers for image recognition at scale	A Dosovitskiy, L Beyer, A Kolesnikov, D Weissenborn, X Zhai, ...	arXiv preprint arXiv:2010.11929, 2020	82638	2020
Gemini: a family of highly capable multimodal models	G Team, R Anil, S Borgeaud, JB Alayrac, J Yu, R Soricut, J Schalkwyk, ...	arXiv preprint arXiv:2312.11805, 2023	6995	2023
Scaling instruction-finetuned language models	HW Chung, L Hou, S Longpre, B Zoph, Y Tay, W Fedus, Y Li, X Wang, ...	Journal of Machine Learning Research 25 (70), 1-53, 2024	5299	2024
Vivit: A video vision transformer	A Arnab, M Dehghani, G Heigold, C Sun, M Lučić, C Schmid	arXiv preprint arXiv:2103.15691, 2021	3721	2021
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context	G Team, P Georgiev, VI Lei, R Burnell, L Bai, A Gulati, G Tanzer, ...	arXiv preprint arXiv:2403.05530, 2024	3439	2024
Palm 2 technical report	R Anil, AM Dai, O Firat, M Johnson, D Lepikhin, A Passos, S Shakeri, ...	arXiv preprint arXiv:2305.10403, 2023	2252	2023
Efficient Transformers survey	Y Tay, M Dehghani, D Bahri, D Metzler	ACM Computing Surveys 55 (6), 1–28, 2022	2019*	2022
Gemini 2.5: Pushing the frontier with advanced reasoning, multimodality, long context, and next generation agentic capabilities	G Comanici, E Bieber, M Schaekermann, I Pasupat, N Sachdeva, I Dhillon, ...	arXiv preprint arXiv:2507.06261, 2025	1337	2025
Universal Transformers	M Dehghani, S Gouws, O Vinyals, J Uszkoreit, Ł Kaiser	International Conference on Learning Representations (ICLR), 2019	1241	2019
Long Range Arena: A Benchmark for Efficient Transformers	Y Tay, M Dehghani, S Abnar, Y Shen, D Bahri, P Pham, J Rao, L Yang, ...	arXiv preprint arXiv:2011.04006, 2020	979	2020
Scaling vision transformers to 22 billion parameters	M Dehghani, J Djolonga, B Mustafa, P Padlewski, J Heek, J Gilmer, ...	International conference on machine learning, 7480-7512, 2023	884	2023
Simple open-vocabulary object detection	M Minderer, A Gritsenko, A Stone, M Neumann, D Weissenborn, ...	European conference on computer vision, 728-755, 2022	856	2022
Ul2: Unifying language learning paradigms	Y Tay, M Dehghani, VQ Tran, X Garcia, J Wei, X Wang, HW Chung, ...	arXiv preprint arXiv:2205.05131, 2022	458	2022
Neural Ranking Models with Weak Supervision	M Dehghani, H Zamani, A Severyn, J Kamps, WB Croft	The 40th International ACM SIGIR Conference on Research and Development in …, 2017	456	2017
Transformer memory as a differentiable search index	Y Tay, V Tran, M Dehghani, J Ni, D Bahri, H Mehta, Z Qin, K Hui, Z Zhao, ...	Advances in Neural Information Processing Systems 35, 21831-21843, 2022	416	2022
Parameter-efficient multi-task fine-tuning for transformers via shared hypernetworks	RK Mahabadi, S Ruder, M Dehghani, J Henderson	arXiv preprint arXiv:2106.04489, 2021	411	2021
Metnet: A neural weather model for precipitation forecasting	CK Sønderby, L Espeholt, J Heek, M Dehghani, A Oliver, T Salimans, ...	arXiv preprint arXiv:2003.12140, 2020	404	2020
Confident adaptive language modeling	T Schuster, A Fisch, J Gupta, M Dehghani, D Bahri, V Tran, Y Tay, ...	Advances in Neural Information Processing Systems 35, 17456-17472, 2022	340	2022
Pali-x: On scaling up a multilingual vision and language model	X Chen, J Djolonga, P Padlewski, B Mustafa, S Changpinyo, J Wu, ...	arXiv preprint arXiv:2305.18565, 2023	274	2023
From neural re-ranking to neural ranking: Learning a sparse representation for inverted indexing	H Zamani, M Dehghani, WB Croft, E Learned-Miller, J Kamps	Proceedings of the 27th ACM international conference on information and …, 2018	253	2018
