Amirkeivan Mohtashami

Cited by

	All	Since 2021
Citations	1317	1316
h-index	11	11
i10-index	11	11

Citations per year: 2022: 12, 2023: 62, 2024: 452, 2025: 767, 2026: 15

Public access

2 articles available, 0 articles not available (based on funding mandates)

Co-authors

Martin Jaggi, EPFL (verified email at epfl.ch)
Matteo Pagliardini, EPFL (verified email at epfl.ch)
Sebastian Urban Stich, CISPA Helmholtz Center (verified email at cispa.de)
Dan Alistarh, Professor at IST Austria (verified email at ist.ac.at)
Saleh Ashkboos, ETH Zurich (verified email at inf.ethz.ch)
Florian Hartmann, Google DeepMind (verified email at google.com)
Paul K Rubenstein, Google DeepMind (verified email at google.com)
Mohammad Roghani, PhD student, Stanford University (verified email at stanford.edu)
Ehsan Pajouheshgar, PhD student, EPFL (verified email at epfl.ch)

Amirkeivan Mohtashami

EPFL

Verified email at epfl.ch

long context large language models efficient transformers neural network optimization


Title	Authors	Venue	Cited by	Year
Meditron-70b: Scaling medical pretraining for large language models	Z Chen, AH Cano, A Romanou, A Bonnet, K Matoba, F Salvi, ...	arXiv preprint arXiv:2311.16079, 2023	557	2023
QuaRot: Outlier-free 4-bit inference in rotated LLMs	S Ashkboos, A Mohtashami, ML Croci, B Li, P Cameron, M Jaggi, ...	Advances in Neural Information Processing Systems 37, 100213-100240, 2024	343	2024
Landmark Attention: Random-Access Infinite Context Length for Transformers	A Mohtashami, M Jaggi	Advances in Neural Information Processing Systems (NeurIPS) 2023, 2023	225*	2023
Masked Training of Neural Networks with Partial Gradients	A Mohtashami, M Jaggi, SU Stich	The 25th International Conference on Artificial Intelligence and Statistics, 2021	49*	2021
Critical parameters for scalable distributed learning with large batches and asynchronous updates	S Stich, A Mohtashami, M Jaggi	International Conference on Artificial Intelligence and Statistics, 4042-4050, 2021	24	2021
Special Properties of Gradient Descent with Large Learning Rates	A Mohtashami, M Jaggi, S Stich	ICML 2023, 2022	20*	2022
The splay-list: A distribution-adaptive concurrent skip-list	V Aksenov, D Alistarh, A Drozdova, A Mohtashami	34th International Symposium on Distributed Computing 179, 2020	19	2020
Characterizing & finding good data orderings for fast convergence of sequential gradient methods	A Mohtashami, S Stich, M Jaggi	arXiv preprint arXiv:2202.01838, 2022	17	2022
DenseFormer: Enhancing information flow in transformers via depth weighted averaging	M Pagliardini, A Mohtashami, F Fleuret, M Jaggi	Advances in Neural Information Processing Systems 37, 136479-136508, 2024	15	2024
Social Learning: Towards Collaborative Learning with Large Language Models	A Mohtashami, F Hartmann, S Gooding, L Zilka, M Sharifi, ...	arXiv preprint arXiv:2312.11441, 2023	14	2023
CoTFormer: A chain-of-thought driven architecture with budget-adaptive computation cost at inference	A Mohtashami, M Pagliardini, M Jaggi	arXiv preprint arXiv:2310.10845, 2023	13	2023
CoTFormer: More Tokens With Attention Make Up For Less Depth	A Mohtashami, M Pagliardini, M Jaggi	Workshop on Advancing Neural Network Training @ NeurIPS 2023, 2023	7	2023
Learning Translation Quality Evaluation on Low Resource Languages from Large Language Models	A Mohtashami, M Verzetti, PK Rubenstein	Practical ML for Developing Countries Workshop @ ICLR 2023, 2023	7	2023
Meditron: Open medical foundation models adapted for clinical practice	Z Chen, A Romanou, A Bonnet, A Hernández-Cano, B Alkhamissi, ...		6	2024
TPS (task preparation system): A tool for developing tasks in programming contests	K Mirjalali, AK Mohtashami, M Roghani, H Zarrabi-Zadeh	Olympiads in Informatics 13, 209-215, 2019	1	2019
Reproducibility Report for "On Warm-Starting Neural Network Training"	A Mohtashami, E Pajouheshgar, K Kireev	ML Reproducibility Challenge 2020, 2021		2021
34th International Symposium on Distributed Computing (DISC 2020)	S Assadi, A Bernstein, Z Langley, A Rinberg, I Keidar, V Aksenov, ...	Schloss Dagstuhl-Leibniz-Zentrum für Informatik GmbH, 2020		2020
LIPIcs, Volume 179, DISC 2020, Complete Volume	H Attiya, S Assadi, A Bernstein, Z Langley, A Rinberg, I Keidar, V Aksenov, ...	34th International Symposium on Distributed Computing (DISC 2020) 179, 0, 2020		2020
A Gradient-Based Approach to Neural Networks Structure Learning	AA Moinfar, A Mohtashami, M Soleymani, A Sharifi-Zarchi			
