Andrew M. Bean

Cited by

	All	Since 2021
Citations	446	445
h-index	6	6
i10-index	6	6

Citations per year
Year	2023	2024	2025	2026
Citations	2	106	319	17

Public access

3 articles available
0 articles not available
Based on funding mandates

Co-authors

Hannah Rose Kirk	University of Oxford	Verified email at oii.ox.ac.uk
Scott A. Hale	Oxford Internet Institute, University of Oxford and Meedan	Verified email at oii.ox.ac.uk
Bertie Vidgen	Oxford, Mercor	Verified email at mercor.com
Paul Röttger	Bocconi University	Verified email at unibocconi.it
Katerina Margatina	Applied Scientist, AWS Agentic AI Labs	Verified email at amazon.com
Max Bartolo	Google DeepMind, UCL	Verified email at cs.ucl.ac.uk
Alexander Whitefield	The Wharton School	Verified email at wharton.upenn.edu
Adam Mahdi	Associate Professor, University of Oxford	Verified email at eng.ox.ac.uk
He He	New York University	Verified email at cs.nyu.edu
Adina Williams	Research Scientist, FAIR (Meta Platforms Inc.)	Verified email at nyu.edu
Harry Mayne	University of Oxford	Verified email at oii.ox.ac.uk
Manuel Tonneau	University of Oxford, World Bank, New York University	Verified email at oii.ox.ac.uk
Khyati Khandelwal	University of Oxford, Google India	Verified email at oii.ox.ac.uk
Robert Mccraith	University of Oxford	Verified email at robots.ox.ac.uk
Karolina Korgul	Oxford Internet Institute, University of Oxford	Verified email at stx.ox.ac.uk
Felix Krones	Oxford University	Verified email at wolfson.ox.ac.uk
Jabez Magomere	University of Oxford	Verified email at keble.ox.ac.uk
Yushi Yang	PhD student at University of Oxford	Verified email at oii.ox.ac.uk
Ryan Othniel Kearns	Oxford Internet Institute; Oxford Human-Centered AI Lab	Verified email at oii.ox.ac.uk
Luc Rocher	Associate Professor, University of Oxford	Verified email at mit.edu

Andrew M. Bean

University of Oxford

Verified email at oii.ox.ac.uk

benchmarking and evaluation, human-AI teamwork, large language models


Title	Authors	Venue	Cited by	Year
The PRISM Alignment Dataset: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models	HR Kirk, A Whitefield, P Röttger, AM Bean, K Margatina, R Mosquera, ...	The Thirty-eighth Conference on Neural Information Processing Systems …, 2024	244*	2024
The Past, Present and Better Future of Feedback Learning in Large Language Models for Subjective Human Preferences and Values	HR Kirk, AM Bean, B Vidgen, P Röttger, SA Hale	Empirical Methods in Natural Language Processing, 2409–2430, 2023	72	2023
Indian-BhED: A Dataset for Measuring India-Centric Biases in Large Language Models	K Khandelwal, M Tonneau, AM Bean, HR Kirk, SA Hale	GoodIT '24: Proceedings of the 2024 International Conference on Information …, 2024	65*	2024
LINGOLY: A Benchmark of Olympiad-Level Linguistic Reasoning Puzzles in Low Resource and Extinct Languages	AM Bean, S Hellsten, H Mayne, J Magomere, E Chi, R Chi, SA Hale, ...	Advances in Neural Information Processing Systems 37, 26224-26237, 2024	32	2024
Clinical knowledge in LLMs does not translate to human interactions	AM Bean, R Payne, G Parsons, HR Kirk, J Ciro, R Mosquera, ...	arXiv preprint arXiv:2504.18919, 2025	12	2025
Do Large Language Models have Shared Weaknesses in Medical Question Answering?	AM Bean, K Korgul, F Krones, R McCraith, A Mahdi	AIM-FM Workshop @ NeurIPS'24, arXiv: 2310.07225, 2024	10*	2024
LLMs Don’t Know Their Own Decision Boundaries: The Unreliability of Self-Generated Counterfactual Explanations	H Mayne, RO Kearns, Y Yang, AM Bean, ED Delaney, C Russell, A Mahdi	Proceedings of the 2025 Conference on Empirical Methods in Natural Language …, 2025	4	2025
Measuring what Matters: Construct Validity in Large Language Model Benchmarks	AM Bean, RO Kearns, A Romanou, FS Hafner, H Mayne, J Batzner, ...	arXiv preprint arXiv:2511.04703, 2025	3	2025
Evaluating Fine-Tuning Efficiency of Human-Inspired Learning Strategies in Medical Question Answering	Y Yang, AM Bean, R McCraith, A Mahdi	NeurIPS 2024 Workshop on Fine-Tuning in Modern Machine Learning: Principles …, 2024	3*	2024
LINGOLY-TOO: Disentangling Reasoning from Knowledge with Templatised Orthographic Obfuscation	J Khouja, K Korgul, S Hellsten, L Yang, V Neacsu, H Mayne, R Kearns, ...	arXiv preprint arXiv:2503.02972, 2025	1	2025
Scales++: Compute Efficient Evaluation Subset Selection with Cognitive Scales Embeddings	AM Bean, N Seedat, S Chen, JR Schwarz	arXiv preprint arXiv:2510.26384, 2025		2025
Evaluating the role of 'Constitutions' for learning from AI feedback	S Redgate, AM Bean, A Mahdi	arXiv preprint arXiv:2411.10168, 2024		2024
