| Testing and evaluation of health care applications of large language models: a systematic review S Bedi, Y Liu, L Orr-Ewing, D Dash, S Koyejo, A Callahan, JA Fries, ... JAMA, 2025 | 348 | 2025 |
| Genome-wide bioinformatic analyses predict key host and viral factors in SARS-CoV-2 pathogenesis MG Ferrarini, A Lal, R Rebollo, AJ Gruber, A Guarracino, IM Gonzalez, ... Communications biology 4 (1), 590, 2021 | 49 | 2021 |
| A systematic review of testing and evaluation of healthcare applications of large language models (LLMs) S Bedi, Y Liu, L Orr-Ewing, D Dash, S Koyejo, A Callahan, JA Fries, ... medRxiv, 2024.04.15.24305869, 2024 | 38 | 2024 |
| MedHELM: Holistic Evaluation of Large Language Models for Medical Tasks S Bedi, H Cui, M Fuentes, A Unell, M Wornow, JM Banda, N Kotecha, ... arXiv preprint arXiv:2505.23802, 2025 | 36 | 2025 |
| Similarities between bacterial GAD and human GAD65: Implications in gut mediated autoimmune type 1 diabetes S Bedi, TM Richardson, B Jia, H Saab, FSL Brinkman, M Westley PLoS One 17 (2), e0261103, 2022 | 24 | 2022 |
| A systematic review of testing and evaluation of healthcare applications of … S Bedi, Y Liu, L Orr-Ewing, Dev Dash, Sanmi Koyejo, Alison Callahan, Jason A Fries, Michael Wornow, Akshay Swaminathan, Lisa Soleymani Lehmann, et al. 2024 | 24 | |
| Evaluating the clinical benefits of LLMs S Bedi, SS Jain, NH Shah Nature medicine 30 (9), 2409-2410, 2024 | 23 | 2024 |
| Dev Dash, Sanmi Koyejo, Alison Callahan, Jason A S Bedi, Y Liu, L Orr-Ewing medRxiv, 2024 | 22 | 2024 |
| A Bayesian modified Ising model for identifying spatially variable genes from spatial transcriptomics data X Jiang, G Xiao, Q Li Statistics in medicine 41 (23), 4647-4665, 2022 | 22 | 2022 |
| Context clues: Evaluating long context models for clinical prediction tasks on EHRs M Wornow, S Bedi, MAF Hernandez, E Steinberg, JA Fries, C Ré, ... arXiv preprint arXiv:2412.16178, 2024 | 20 | 2024 |
| Measurement to Meaning: A Validity-Centered Framework for AI Evaluation O Salaudeen, A Reuel, A Ahmed, S Bedi, Z Robertson, S Sundar, ... arXiv preprint arXiv:2505.10573, 2025 | 16 | 2025 |
| Fidelity of medical reasoning in large language models S Bedi, Y Jiang, P Chung, S Koyejo, N Shah JAMA Network Open 8 (8), e2526021-e2526021, 2025 | 14 | 2025 |
| Disentangling reasoning and knowledge in medical large language models R Thapa, Q Wu, K Wu, H Zhang, A Zhang, E Wu, H Ye, S Bedi, N Aresh, ... arXiv preprint arXiv:2505.11462, 2025 | 9 | 2025 |
| QUEST-AI: a system for question generation, verification, and refinement using AI for USMLE-Style Exams S Bedi, SL Fleming, CC Chiang, K Morse, A Kumar, B Patel, JA Jindal, ... Biocomputing 2025: Proceedings of the Pacific Symposium, 54-69, 2024 | 8 | 2024 |
| A systematic review of testing and evaluation of healthcare applications of large language models (LLMs) S Bedi, Y Liu, L Orr-Ewing, D Dash, S Koyejo, A Callahan, JA Fries, ... medRxiv, preprint posted online Apr 16, 2024 | 7 | 2024 |
| The Optimization Paradox in Clinical AI Multi-Agent Systems S Bedi, I Mlauzi, D Shin, S Koyejo, NH Shah arXiv preprint arXiv:2506.06574, 2025 | 5 | 2025 |
| Distilling Large Language Models for Efficient Clinical Information Extraction KS Vedula, A Gupta, A Swaminathan, I Lopez, S Bedi, NH Shah arXiv preprint arXiv:2501.00031, 2024 | 5 | 2024 |
| Feasibility of automatically detecting practice of race-based medicine by large language models A Swaminathan, S Salvi, P Chung, A Callahan, S Bedi, A Unell, ... AAAI 2024 spring symposium on clinical foundation models, 2024 | 5 | 2024 |
| Spatial Transcriptomics Arena (STAr): an Integrated Platform for Spatial Transcriptomics Methodology Research X Jiang, D Luo, E Fernández, J Yang, H Li, KW Jin, Y Zhan, B Yao, S Bedi, ... bioRxiv, 2023.03.10.532127, 2023 | 3 | 2023 |
| meds_reader: A fast and efficient EHR processing library E Steinberg, M Wornow, S Bedi, JA Fries, M McDermott, NH Shah arXiv preprint arXiv:2409.09095, 2024 | 2 | 2024 |