Joe Kwon
Verified email at csail.mit.edu
Title
Cited by
Year
Scaling out-of-distribution detection for real-world settings
D Hendrycks, S Basart, M Mazeika, A Zou, J Kwon, M Mostajabi, ...
arXiv preprint arXiv:1911.11132, 2019
Cited by 733, 2019
Explore, establish, exploit: Red teaming language models from scratch
S Casper, J Lin, J Kwon, G Culp, D Hadfield-Menell
arXiv preprint arXiv:2306.09442, 2023
Cited by 139, 2023
Forecasting future world events with neural networks
A Zou, T Xiao, R Jia, J Kwon, M Mazeika, R Li, D Song, J Steinhardt, ...
Advances in Neural Information Processing Systems 35, 27293-27305, 2022
Cited by 51, 2022
Social inferences from physical evidence via Bayesian event reconstruction
M Lopez-Brau, J Kwon, J Jara-Ettinger
Journal of Experimental Psychology: General 151 (9), 2029, 2022
Cited by 22, 2022
Large Language Models Are More Persuasive Than Incentivized Human Persuaders
P Schoenegger, F Salvi, J Liu, X Nan, R Debnath, B Fasolo, E Leivada, ...
arXiv preprint arXiv:2505.09662, 2025
Cited by 17, 2025
When it is not out of line to get out of line: The role of universalization and outcome-based reasoning in rule-breaking judgments
J Kwon, T Zhi-Xuan, J Tenenbaum, S Levine
Proceedings of the Annual Meeting of the Cognitive Science Society 45, 2023
Cited by 15, 2023
Neuro-symbolic models of human moral judgment: LLMs as automatic feature extractors
J Kwon, S Levine, JB Tenenbaum
Cited by 10, 2023
Flexibility in Moral Cognition: When is it okay to break the rules?
J Kwon, J Tenenbaum, S Levine
Proceedings of the Annual Meeting of the Cognitive Science Society 44 (44), 2022
Cited by 9, 2022
Comparing bottom-up and top-down steering approaches on in-context learning tasks
M Brumley, J Kwon, D Krueger, D Krasheninnikov, U Anwar
arXiv preprint arXiv:2411.07213, 2024
Cited by 6, 2024
Mental state inference from indirect evidence through Bayesian event reconstruction
M Lopez-Brau, J Kwon, J Jara-Ettinger
Proceedings of the Annual Meeting of the Cognitive Science Society 42, 2020
Cited by 6, 2020
Neuro-symbolic models of human moral judgment
J Kwon, J Tenenbaum, S Levine
Proceedings of the Annual Meeting of the Cognitive Science Society 46, 2024
Cited by 3, 2024
Improving and assessing anomaly detectors for large-scale settings
D Hendrycks, S Basart, M Mazeika, A Zou, J Kwon, M Mostajabi, ...
Cited by 3, 2022
When it's not out of line to get out of line: Principles of universalizability, welfare, and harm
J Kwon, T Zhi-Xuan, J Tenenbaum, S Levine
Proceedings of the Annual Meeting of the Cognitive Science Society 45 (45), 2023
Cited by 1, 2023
Internal Deployment Gaps in AI Regulation
J Kwon, S Casper
arXiv preprint arXiv:2601.08005, 2026
2026
Detecting the involvement of agents through physical reasoning
M Lopez-Brau, J Kwon, B McBean, I Yildirim, J Jara-Ettinger
Proceedings of the Annual Meeting of the Cognitive Science Society 43 (43), 2021
2021
Lift-the-flap: what, where and when for context reasoning
M Zhang, C Tseng, K Montejo, J Kwon, G Kreiman
arXiv preprint arXiv:1902.00163, 2019
2019
Does It Know?: Probing and Benchmarking Uncertainty in Language Model Latent Beliefs
BRY Huang, J Kwon