Esin Durmus
Anthropic
Verified email at anthropic.com · Homepage
Title · Cited by · Year
On the opportunities and risks of foundation models
R Bommasani
arXiv preprint arXiv:2108.07258, 2021
8285 · 2021
Holistic evaluation of language models
P Liang, R Bommasani, T Lee, D Tsipras, D Soylu, M Yasunaga, Y Zhang, ...
arXiv preprint arXiv:2211.09110, 2022
2231 · 2022
Benchmarking large language models for news summarization
T Zhang, F Ladhak, E Durmus, P Liang, K McKeown, TB Hashimoto
Transactions of the Association for Computational Linguistics 12, 39-57, 2024
834 · 2024
Whose opinions do language models reflect?
S Santurkar, E Durmus, F Ladhak, C Lee, P Liang, T Hashimoto
International Conference on Machine Learning, 29971-30004, 2023
821 · 2023
Scaling monosemanticity: Extracting interpretable features from claude 3 sonnet
A Templeton
Anthropic, 2024
622 · 2024
Towards understanding sycophancy in language models
M Sharma, M Tong, T Korbak, D Duvenaud, A Askell, SR Bowman, ...
arXiv preprint arXiv:2310.13548, 2023
604 · 2023
Easily accessible text-to-image generation amplifies demographic stereotypes at large scale
F Bianchi, P Kalluri, E Durmus, F Ladhak, M Cheng, D Nozza, ...
Proceedings of the 2023 ACM Conference on Fairness, Accountability, and …, 2023
566 · 2023
FEQA: A question answering evaluation framework for faithfulness assessment in abstractive summarization
E Durmus, H He, M Diab
ACL, 2020
534 · 2020
Towards measuring the representation of subjective global opinions in language models
E Durmus, K Nguyen, TI Liao, N Schiefer, A Askell, A Bakhtin, C Chen, ...
arXiv preprint arXiv:2306.16388, 2023
340 · 2023
Marked personas: Using natural language prompts to measure stereotypes in language models
M Cheng, E Durmus, D Jurafsky
arXiv preprint arXiv:2305.18189, 2023
301 · 2023
Measuring faithfulness in chain-of-thought reasoning
T Lanham, A Chen, A Radhakrishnan, B Steiner, C Denison, ...
arXiv preprint arXiv:2307.13702, 2023
284 · 2023
WikiLingua: A new benchmark dataset for cross-lingual abstractive summarization
F Ladhak, E Durmus, C Cardie, K McKeown
arXiv preprint arXiv:2010.03093, 2020
284 · 2020
Many-shot Jailbreaking
C Anil, E Durmus, M Sharma, J Benton, S Kundu, J Batson, N Rimsky, ...
266 · 2024
Studying large language model generalization with influence functions
R Grosse, J Bae, C Anil, N Elhage, A Tamkin, A Tajdini, B Steiner, D Li, ...
arXiv preprint arXiv:2308.03296, 2023
261 · 2023
The GEM benchmark: Natural language generation, its evaluation and metrics
S Gehrmann, T Adewumi, K Aggarwal, PS Ammanamanchi, A Aremu, ...
Proceedings of the 1st Workshop on Natural Language Generation, Evaluation …, 2021
214 · 2021
On the opportunities and risks of foundation models (2021)
R Bommasani, DA Hudson, E Adeli, R Altman, S Arora, S von Arx, ...
arXiv preprint arXiv:2108.07258 10, 2022
199 · 2022
Evaluating human-language model interaction
M Lee, M Srivastava, A Hardy, J Thickstun, E Durmus, A Paranjape, ...
arXiv preprint arXiv:2212.09746, 2022
186 · 2022
Collective Constitutional AI: Aligning a language model with public input
D Ganguli, S Huang, L Lovitt, D Siddarth, T Liao, A Askell, Y Bai, ...
Anthropic, 2024
143* · 2024
Evaluating and mitigating discrimination in language model decisions
A Tamkin, A Askell, L Lovitt, E Durmus, N Joseph, S Kravec, K Nguyen, ...
arXiv preprint arXiv:2312.03689, 2023
119 · 2023
Question decomposition improves the faithfulness of model-generated reasoning
A Radhakrishnan, K Nguyen, A Chen, C Chen, C Denison, D Hernandez, ...
arXiv preprint arXiv:2307.11768, 2023
111* · 2023
Articles 1–20