Theodore Sumers
Anthropic
Verified email at anthropic.com
Title
Cited by
Year
Scaling monosemanticity: Extracting interpretable features from Claude 3 Sonnet
A Templeton, T Conerly, J Marcus, J Lindsey, T Bricken, B Chen, ...
Anthropic, 2024
Cited by: 639; Year: 2024
Cognitive architectures for language agents
TR Sumers*, S Yao*, K Narasimhan, TL Griffiths
Transactions on Machine Learning Research, 2023
Cited by: 502; Year: 2023
Shared functional specialization in transformer-based language models and the human brain
S Kumar*, TR Sumers*, T Yamakoshi, A Goldstein, U Hasson, KA Norman, ...
Nature Communications 15 (1), 5523, 2024
Cited by: 123*; Year: 2024
Constitutional classifiers: Defending against universal jailbreaks across thousands of hours of red teaming
M Sharma, M Tong, J Mu, J Wei, J Kruthoff, S Goodfriend, E Ong, A Peng, ...
arXiv preprint arXiv:2501.18837, 2025
Cited by: 89; Year: 2025
Learning rewards from linguistic feedback
TR Sumers, MK Ho, RD Hawkins, K Narasimhan, TL Griffiths
Proceedings of the AAAI Conference on Artificial Intelligence 35 (7), 6002-6010, 2021
Cited by: 70; Year: 2021
Complex cognitive algorithms preserved by selective social learning in experimental populations
B Thompson, B Van Opheusden, T Sumers, TL Griffiths
Science 376 (6588), 95-98, 2022
Cited by: 66; Year: 2022
Reconciling truthfulness and relevance as epistemic and decision-theoretic utility.
TR Sumers, MK Ho, TL Griffiths, RD Hawkins
Psychological Review, 2023
Cited by: 59*; Year: 2023
Words are all you need? Language as an approximation for human similarity judgments
R Marjieh, P Van Rijn, I Sucholutsky, T Sumers, H Lee, TL Griffiths, ...
The Eleventh International Conference on Learning Representations, 2023
Cited by: 57*; Year: 2023
Clio: Privacy-preserving insights into real-world AI use
A Tamkin, M McCain, K Handa, E Durmus, L Lovitt, A Rathi, S Huang, ...
arXiv preprint arXiv:2412.13678, 2024
Cited by: 48; Year: 2024
Simplifying GPS data for map building and distance calculation
S Cui, TD Nguyen, TR Sumers, M Yu, X Zhang
US Patent 9,939,276, 2018
Cited by: 44; Year: 2018
How do Large Language Models Navigate Conflicts between Honesty and Helpfulness?
R Liu*, TR Sumers*, I Dasgupta, TL Griffiths
ICML, 2024
Cited by: 42; Year: 2024
How to talk so AI will learn: Instructions, descriptions, and autonomy
T Sumers, R Hawkins, MK Ho, T Griffiths, D Hadfield-Menell
Advances in Neural Information Processing Systems 35, 34762-34775, 2022
Cited by: 41*; Year: 2022
Distilling Internet-Scale Vision-Language Models into Embodied Agents
T Sumers, K Marino, A Ahuja, R Fergus, I Dasgupta
International Conference on Machine Learning, 2023
Cited by: 40; Year: 2023
Show or tell? Exploring when (and why) teaching with language outperforms demonstration
TR Sumers, MK Ho, RD Hawkins, TL Griffiths
Cognition 232, 105326, 2023
Cited by: 34*; Year: 2023
Pickup location selection and augmented reality navigation
J Badalamenti, J Inch, CM Sanchez, TR Sumers
US Patent 10,508,925, 2019
Cited by: 34; Year: 2019
Network computer system for analyzing driving actions of drivers on road segments of a geographic region
TR Sumers
US Patent 10,297,148, 2019
Cited by: 31; Year: 2019
Scaling monosemanticity: Extracting interpretable features from claude 3 sonnet, 2024
A Templeton, T Conerly, J Marcus, J Lindsey, T Bricken, B Chen, ...
URL https://transformer-circuits.pub/2024/scaling-monosemanticity 1, 2024
Cited by: 27; Year: 2024
Trip termination determination for on-demand transport
K Brinig, M Ioffe, B Layton, T Sumers, MW Kadous
US Patent 10,672,198, 2020
Cited by: 27; Year: 2020
Cascaded boosted predictive models
D Purdy, L Chen, TR Sumers
US Patent 11,138,524, 2021
Cited by: 26; Year: 2021
Augmented reality assisted pickup
J Badalamenti, J Inch, CM Sanchez, TR Sumers
US Patent 10,423,834, 2019
Cited by: 24; Year: 2019