Cited by
Citations: 317 (all), 314 (since 2021)
h-index: 2 (all), 2 (since 2021)
i10-index: 2 (all), 2 (since 2021)
Citations per year: 2023: 3, 2024: 93, 2025: 204, 2026: 10
Co-authors
Stephen Casper, PhD student, MIT (verified email at mit.edu)
Javier Rando, Anthropic (verified email at anthropic.com)
Arush Tagade, PhD Student, George Washington University (verified email at gwu.edu)
Rusheb Shah, Apollo Research (verified email at apolloresearch.ai)
Soroush Pour
Harmony Intelligence
Verified email at soroushjp.com
AI safety
AI alignment
Large Language Models
Transformers
Articles
Scalable and Transferable Black-Box Jailbreaks for Language Models via Persona Modulation
R Shah, Q Feuillade--Montixi, S Pour, A Tagade, S Casper, J Rando
arXiv preprint arXiv:2311.03348, 2023
Cited by 184
The AI Risk Repository: A Comprehensive Meta-Review, Database, and Taxonomy of Risks From Artificial Intelligence
P Slattery, AK Saeri, EAC Grundy, J Graham, M Noetel, R Uuk, J Dao, ...
arXiv preprint arXiv:2408.12622, 2024
Cited by 133