| Resurrecting recurrent neural networks for long sequences A Orvieto, SL Smith, A Gu, A Fernando, C Gulcehre, R Pascanu, S De International Conference on Machine Learning, 26670-26698, 2023 | 506 | 2023 |
| Learning explanations that are hard to vary G Parascandolo, A Neitz, A Orvieto, L Gresele, B Schölkopf International Conference on Learning Representations (2021), 2020 | 253 | 2020 |
| Signal Propagation in Transformers: Theoretical Perspectives and the Role of Rank Collapse L Noci, S Anagnostidis, L Biggio, A Orvieto, SP Singh, A Lucchi Advances in Neural Information Processing Systems (NeurIPS) 2022, 2022 | 133 | 2022 |
| Faster single-loop algorithms for minimax optimization without strong concavity J Yang, A Orvieto, A Lucchi, N He International Conference on Artificial Intelligence and Statistics, 5485-5517, 2022 | 95 | 2022 |
| Achieving a better stability-plasticity trade-off via auxiliary networks in continual learning S Kim, L Noci, A Orvieto, T Hofmann Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2023 | 79 | 2023 |
| A continuous-time perspective for modeling acceleration in Riemannian optimization F Alimisis, A Orvieto, G Bécigneul, A Lucchi International Conference on Artificial Intelligence and Statistics, 1297-1307, 2020 | 79 | 2020 |
| Momentum improves optimization on Riemannian manifolds F Alimisis, A Orvieto, G Bécigneul, A Lucchi International Conference on Artificial Intelligence and Statistics, 1351-1359, 2021 | 74* | 2021 |
| Theoretical foundations of deep selective state-space models N Muca Cirone, A Orvieto, B Walker, C Salvi, T Lyons Advances in Neural Information Processing Systems 37, 127226-127272, 2024 | 70 | 2024 |
| Recurrent neural networks: vanishing and exploding gradients are not the end of the story N Zucchet, A Orvieto Advances in Neural Information Processing Systems 37, 139402-139443, 2024 | 66 | 2024 |
| Anticorrelated noise injection for improved generalization A Orvieto, H Kersting, F Proske, F Bach, A Lucchi International Conference on Machine Learning (ICML), 2022, 2022 | 65 | 2022 |
| Dynamics of SGD with Stochastic Polyak Stepsizes: Truly Adaptive Variants and Convergence to Exact Solution A Orvieto, S Lacoste-Julien, N Loizou Advances in Neural Information Processing Systems (NeurIPS) 2022, 2022 | 54 | 2022 |
| Universality of linear recurrences followed by non-linear projections: finite-width guarantees and benefits of complex eigenvalues A Orvieto, S De, C Gulcehre, R Pascanu, SL Smith arXiv preprint arXiv:2307.11888, 2023 | 49* | 2023 |
| Continuous-time models for stochastic optimization algorithms A Orvieto, A Lucchi Advances in Neural Information Processing Systems 32 (2019), 2018 | 47 | 2018 |
| Explicit regularization in overparametrized models via noise injection A Orvieto, A Raj, H Kersting, F Bach International Conference on Artificial Intelligence and Statistics, 7265-7287, 2023 | 41 | 2023 |
| Understanding the differences in foundation models: Attention, state space models, and recurrent neural networks J Sieber, CA Alonso, A Didier, MN Zeilinger, A Orvieto Advances in Neural Information Processing Systems 37, 134534-134566, 2024 | 38 | 2024 |
| An SDE for modeling SAM: Theory and insights EM Compagnoni, L Biggio, A Orvieto, FN Proske, H Kersting, A Lucchi International Conference on Machine Learning, 25209-25253, 2023 | 36 | 2023 |
| On the effectiveness of randomized signatures as reservoir for learning rough dynamics EM Compagnoni, A Scampicchio, L Biggio, A Orvieto, T Hofmann, ... 2023 International Joint Conference on Neural Networks (IJCNN), 1-8, 2023 | 34* | 2023 |
| The role of memory in stochastic optimization A Orvieto, J Kohler, A Lucchi Uncertainty in Artificial Intelligence, 356-366, 2020 | 33 | 2020 |
| Shadowing properties of optimization algorithms A Orvieto, A Lucchi Advances in Neural Information Processing Systems 32 (2019), 2019 | 24 | 2019 |
| Super consistency of neural network landscapes and learning rate transfer L Noci, A Meterez, T Hofmann, A Orvieto Advances in Neural Information Processing Systems 37, 102696-102743, 2024 | 23* | 2024 |