| Astrollama: Towards specialized foundation models in astronomy TD Nguyen, YS Ting, I Ciuca, C O’Neill, ZC Sun, M Jabłońska, S Kruk, ... Proceedings of the Second Workshop on Information Extraction from Scientific …, 2023 | 39 | 2023 |
| Astrollama-chat: Scaling astrollama with conversational and diverse datasets E Perkowski, R Pan, TD Nguyen, YS Ting, S Kruk, T Zhang, C O’Neill, ... Research Notes of the AAS 8 (1), 7, 2024 | 24 | 2024 |
| Disentangling dense embeddings with sparse autoencoders C O'Neill, C Ye, K Iyer, JF Wu arXiv preprint arXiv:2408.00657, 2024 | 21 | 2024 |
| pathfinder: A semantic framework for literature review and knowledge discovery in astronomy KG Iyer, M Yunus, C O’Neill, C Ye, A Hyk, K McCormick, I Ciucă, JF Wu, ... The Astrophysical Journal Supplement Series 275 (2), 38, 2024 | 16 | 2024 |
| Grokking Beyond Neural Networks: An Empirical Exploration with Model Complexity J Miller, C O'Neill, T Bui Transactions on Machine Learning Research, 2023 | 14 | 2023 |
| Sparse autoencoders enable scalable and reliable circuit identification in language models C O'Neill, T Bui arXiv preprint arXiv:2405.12522, 2024 | 11 | 2024 |
| Adversarial fine-tuning of language models: An iterative optimisation approach for the generation and detection of problematic content C O'Neill, J Miller, I Ciuca, YS Ting, T Bui arXiv preprint arXiv:2308.13768, 2023 | 9 | 2023 |
| From superposition to sparse codes: interpretable representations in neural networks D Klindt, C O'Neill, P Reizinger, H Maurer, N Miolane arXiv preprint arXiv:2503.01824, 2025 | 7 | 2025 |
| Sparks of science: Hypothesis generation using structured paper data C O'Neill, T Ghosal, R Răileanu, M Walmsley, T Bui, K Schawinski, I Ciucă arXiv preprint arXiv:2504.12976, 2025 | 6 | 2025 |
| Compute Optimal Inference and Provable Amortisation Gap in Sparse Autoencoders C O'Neill, A Gumran, D Klindt International Conference on Machine Learning (ICML) 2025, 2024 | 6 | 2024 |
| UniverseTBD. 2023 TD Nguyen, YS Ting, I Ciucă, C O’Neill, ZC Sun, M Jabłońska, S Kruk, ... Astrollama: Towards specialized foundation models in astronomy | 5 | |
| Steering language generation: Harnessing contrastive expert guidance and negative prompting for coherent and diverse synthetic data generation C O'Neill, YS Ting, I Ciuca, J Miller, T Bui arXiv preprint arXiv:2308.07645, 2023 | 3 | 2023 |
| Eigenvalue initialisation and regularisation for Koopman autoencoders JW Miller, C O'Neill, NC Constantinou, O Azencot arXiv preprint arXiv:2212.12086, 2022 | 2 | 2022 |
| Sparse autoencoders for dense text embeddings reveal hierarchical feature sub-structure C Ye*, C O'Neill*, JF Wu, KG Iyer NeurIPS 2024 Workshop on Scientific Methods for Understanding Deep Learning | 2 | |
| Towards Interpretable Scientific Foundation Models: Sparse Autoencoders for Disentangling Dense Embeddings of Scientific Concepts C O'Neill, C Ye, KG Iyer, JF Wu NeurIPS 2024 Workshop Foundation Models for Science: Progress, Opportunities … | 2 | |
| Resurrecting the Salmon: Rethinking Mechanistic Interpretability with Domain-Specific Sparse Autoencoders C O'Neill, M Jayasekara, M Kirkby arXiv preprint arXiv:2508.09363, 2025 | 1 | 2025 |
| Measuring Sharpness in Grokking J Miller, P Gleeson, C O'Neill, T Bui, N Levi ICLR 2024 Workshop on Bridging the Gap Between Practice and Theory in Deep …, 2024 | 1 | 2024 |
| Steering semantic search with interpretable features from sparse autoencoders C Ye*, C O'Neill*, JF Wu, KG Iyer 🍃 MINT: Foundation Model Interventions | 1 | |
| A Single Direction of Truth: An Observer Model's Linear Residual Probe Exposes and Steers Contextual Hallucinations C O'Neill, S Chalnev, CC Zhao, M Kirkby, M Jayasekara ICML 2025, Actionable Interpretability Workshop, 2025 | | 2025 |
| Self-Attention as a Parametric Endofunctor: A Categorical Framework for Transformer Architectures C O'Neill arXiv preprint arXiv:2501.02931, 2025 | | 2025 |