Utku Evci
Researcher @ Google DeepMind
Verified email at nyu.edu - Homepage
Title | Cited by | Year
Gemini 2.5: Pushing the frontier with advanced reasoning, multimodality, long context, and next generation agentic capabilities
G Comanici, E Bieber, M Schaekermann, I Pasupat, N Sachdeva, I Dhillon, ...
arXiv preprint arXiv:2507.06261, 2025
1337 | 2025
Scaling vision transformers to 22 billion parameters
M Dehghani, J Djolonga, B Mustafa, P Padlewski, J Heek, J Gilmer, ...
International Conference on Machine Learning, 7480-7512, 2023
884 | 2023
Meta-dataset: A dataset of datasets for learning to learn from few examples
E Triantafillou, T Zhu, V Dumoulin, P Lamblin, U Evci, K Xu, R Goroshin, ...
arXiv preprint arXiv:1903.03096, 2019
857 | 2019
Rigging the lottery: Making all tickets winners
U Evci, T Gale, J Menick, PS Castro, E Elsen
International Conference on Machine Learning, 2943-2952, 2020
846 | 2020
Empirical analysis of the Hessian of over-parametrized neural networks
L Sagun, U Evci, VU Guney, Y Dauphin, L Bottou
arXiv preprint arXiv:1706.04454, 2017
488 | 2017
The dormant neuron phenomenon in deep reinforcement learning
G Sokar, R Agarwal, PS Castro, U Evci
International Conference on Machine Learning, 32145-32168, 2023
169 | 2023
The difficulty of training sparse neural networks
U Evci, F Pedregosa, A Gomez, E Elsen
arXiv preprint arXiv:1906.10732, 2019
121 | 2019
Head2Toe: Utilizing intermediate representations for better transfer learning
U Evci, V Dumoulin, H Larochelle, MC Mozer
International Conference on Machine Learning, 6009-6033, 2022
119 | 2022
Gradient flow in sparse neural networks and how lottery tickets win
U Evci, Y Ioannou, C Keskin, Y Dauphin
Proceedings of the AAAI Conference on Artificial Intelligence 36 (6), 6577-6586, 2022
100 | 2022
GradMax: Growing neural networks using gradient information
U Evci, B van Merrienboer, T Unterthiner, M Vladymyrov, F Pedregosa
arXiv preprint arXiv:2201.05125, 2022
82 | 2022
A practical sparse approximation for real time recurrent learning
J Menick, E Elsen, U Evci, S Osindero, K Simonyan, A Graves
arXiv preprint arXiv:2006.07232, 2020
67* | 2020
Comparing transfer and meta learning approaches on a unified few-shot classification benchmark
V Dumoulin, N Houlsby, U Evci, X Zhai, R Goroshin, S Gelly, H Larochelle
arXiv preprint arXiv:2104.02638, 2021
65* | 2021
The state of sparse training in deep reinforcement learning
L Graesser, U Evci, E Elsen, PS Castro
International Conference on Machine Learning, 7766-7792, 2022
62 | 2022
Dynamic sparse training with structured sparsity
M Lasby, A Golubeva, U Evci, M Nica, Y Ioannou
arXiv preprint arXiv:2305.02299, 2023
39 | 2023
Scaling laws for sparsely-connected foundation models
E Frantar, C Riquelme, N Houlsby, D Alistarh, U Evci
arXiv preprint arXiv:2309.08520, 2023
38 | 2023
Progressive gradient flow for robust N:M sparsity training in transformers
AR Bambhaniya, A Yazdanbakhsh, S Subramanian, SC Kao, S Agrawal, ...
arXiv preprint arXiv:2402.04744, 2024
17 | 2024
JaxPruner: A concise library for sparsity research
JH Lee, W Park, NE Mitchell, J Pilault, JSO Ceron, HB Kim, N Lee, ...
Conference on Parsimony and Learning, 515-528, 2024
16 | 2024
Training Recipe for N:M Structured Sparsity with Decaying Pruning Mask
A Yazdanbakhsh, SC Kao, S Agrawal, S Subramanian, T Krishna, U Evci
arXiv preprint arXiv:2209.07617, 2022
15 | 2022
Detecting dead weights and units in neural networks
U Evci
arXiv preprint arXiv:1806.06068, 2018
13 | 2018
Compression scaling laws: Unifying sparsity and quantization
E Frantar, U Evci, W Park, N Houlsby, D Alistarh
arXiv preprint arXiv:2502.16440, 2025
7 | 2025
Articles 1–20