Cited by
Citations: 14 (all), 14 (since 2021)
h-index: 2 (all), 2 (since 2021)
i10-index: 1 (all), 1 (since 2021)
Citations per year: 12 (2025), 2 (2026)
Public access
1 article available, 0 articles not available (based on funding mandates)
Co-authors
Flavio du Pin Calmon
Harvard University
Verified email at seas.harvard.edu
Himabindu Lakkaraju
Assistant Professor, Harvard University; Senior Staff Research Scientist, Google.
Verified email at seas.harvard.edu
Hadi Khalaf
Harvard University
Verified email at g.harvard.edu
Research interests: AI alignment, AI safety, information theory
Articles
AI Alignment at Your Discretion
M Buyl*, H Khalaf*, C Mayrink Verdun*, L Monteiro Paes*, ...
ACM Conference on Fairness, Accountability, and Transparency, 2025
Cited by: 11
Inference-Time Reward Hacking in Large Language Models
H Khalaf, CM Verdun, A Oesterling, H Lakkaraju, FP Calmon
NeurIPS 2025
Cited by: 3