Antonio Orvieto
ELLIS Institute Tübingen, Max Planck Institute for Intelligent Systems
Verified email at tue.ellis.eu - Homepage
Title / Cited by / Year
Resurrecting recurrent neural networks for long sequences
A Orvieto, SL Smith, A Gu, A Fernando, C Gulcehre, R Pascanu, S De
International Conference on Machine Learning, 26670-26698, 2023
Cited by: 506; Year: 2023
Learning explanations that are hard to vary
G Parascandolo, A Neitz, A Orvieto, L Gresele, B Schölkopf
International Conference on Learning Representations (2021), 2020
Cited by: 253; Year: 2020
Signal Propagation in Transformers: Theoretical Perspectives and the Role of Rank Collapse
L Noci, S Anagnostidis, L Biggio, A Orvieto, SP Singh, A Lucchi
Advances in Neural Information Processing Systems (NeurIPS) 2022, 2022
Cited by: 133; Year: 2022
Faster single-loop algorithms for minimax optimization without strong concavity
J Yang, A Orvieto, A Lucchi, N He
International Conference on Artificial Intelligence and Statistics, 5485-5517, 2022
Cited by: 95; Year: 2022
Achieving a better stability-plasticity trade-off via auxiliary networks in continual learning
S Kim, L Noci, A Orvieto, T Hofmann
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2023
Cited by: 79; Year: 2023
A continuous-time perspective for modeling acceleration in Riemannian optimization
F Alimisis, A Orvieto, G Bécigneul, A Lucchi
International Conference on Artificial Intelligence and Statistics, 1297-1307, 2020
Cited by: 79; Year: 2020
Momentum improves optimization on Riemannian manifolds
F Alimisis, A Orvieto, G Bécigneul, A Lucchi
International Conference on Artificial Intelligence and Statistics, 1351-1359, 2021
Cited by: 74*; Year: 2021
Theoretical foundations of deep selective state-space models
N Muca Cirone, A Orvieto, B Walker, C Salvi, T Lyons
Advances in Neural Information Processing Systems 37, 127226-127272, 2024
Cited by: 70; Year: 2024
Recurrent neural networks: vanishing and exploding gradients are not the end of the story
N Zucchet, A Orvieto
Advances in Neural Information Processing Systems 37, 139402-139443, 2024
Cited by: 66; Year: 2024
Anticorrelated noise injection for improved generalization
A Orvieto, H Kersting, F Proske, F Bach, A Lucchi
International Conference on Machine Learning (ICML), 2022, 2022
Cited by: 65; Year: 2022
Dynamics of SGD with Stochastic Polyak Stepsizes: Truly Adaptive Variants and Convergence to Exact Solution
A Orvieto, S Lacoste-Julien, N Loizou
Advances in Neural Information Processing Systems (NeurIPS) 2022, 2022
Cited by: 54; Year: 2022
Universality of linear recurrences followed by non-linear projections: finite-width guarantees and benefits of complex eigenvalues
A Orvieto, S De, C Gulcehre, R Pascanu, SL Smith
arXiv preprint arXiv:2307.11888, 2023
Cited by: 49*; Year: 2023
Continuous-time models for stochastic optimization algorithms
A Orvieto, A Lucchi
Advances in Neural Information Processing Systems 32 (2019), 2018
Cited by: 47; Year: 2018
Explicit regularization in overparametrized models via noise injection
A Orvieto, A Raj, H Kersting, F Bach
International Conference on Artificial Intelligence and Statistics, 7265-7287, 2023
Cited by: 41; Year: 2023
Understanding the differences in foundation models: Attention, state space models, and recurrent neural networks
J Sieber, CA Alonso, A Didier, MN Zeilinger, A Orvieto
Advances in Neural Information Processing Systems 37, 134534-134566, 2024
Cited by: 38; Year: 2024
An SDE for modeling SAM: Theory and insights
EM Compagnoni, L Biggio, A Orvieto, FN Proske, H Kersting, A Lucchi
International Conference on Machine Learning, 25209-25253, 2023
Cited by: 36; Year: 2023
On the effectiveness of randomized signatures as reservoir for learning rough dynamics
EM Compagnoni, A Scampicchio, L Biggio, A Orvieto, T Hofmann, ...
2023 International Joint Conference on Neural Networks (IJCNN), 1-8, 2023
Cited by: 34*; Year: 2023
The role of memory in stochastic optimization
A Orvieto, J Kohler, A Lucchi
Uncertainty in Artificial Intelligence, 356-366, 2020
Cited by: 33; Year: 2020
Shadowing properties of optimization algorithms
A Orvieto, A Lucchi
Advances in Neural Information Processing Systems 32 (2019), 2019
Cited by: 24; Year: 2019
Super consistency of neural network landscapes and learning rate transfer
L Noci, A Meterez, T Hofmann, A Orvieto
Advances in Neural Information Processing Systems 37, 102696-102743, 2024
Cited by: 23*; Year: 2024