
Ge Yan
Verified email at ucsd.edu - Homepage
VLG-CBM: Training concept bottleneck models with vision-language guidance
D Srivastava, G Yan, L Weng
Advances in Neural Information Processing Systems 37, 79057-79094, 2024
Cited by 43
Provably robust conformal prediction with improved efficiency
G Yan, Y Romano, TW Weng
arXiv preprint arXiv:2404.19651, 2024
Cited by 28
ThinkEdit: Interpretable Weight Editing to Mitigate Overly Short Thinking in Reasoning Models
CE Sun, G Yan, TW Weng
arXiv preprint arXiv:2503.22048, 2025
Cited by 7
Interpretable Generative Models through Post-hoc Concept Bottlenecks
A Kulkarni, G Yan, CE Sun, T Oikarinen, TW Weng
Proceedings of the Computer Vision and Pattern Recognition Conference, 8162-8171, 2025
Cited by 7
Evaluating neuron explanations: A unified framework with sanity checks
T Oikarinen, G Yan, TW Weng
arXiv preprint arXiv:2506.05774, 2025
Cited by 5
Rethinking Crowd-Sourced Evaluation of Neuron Explanations
T Oikarinen, G Yan, A Kulkarni, TW Weng
arXiv preprint arXiv:2506.07985, 2025
Cited by 1
Faithful and Stable Neuron Explanations for Trustworthy Mechanistic Interpretability
G Yan, T Oikarinen
arXiv preprint arXiv:2512.18092, 2025
ReflCtrl: Controlling LLM Reflection via Representation Engineering
G Yan, CE Sun
arXiv preprint arXiv:2512.13979, 2025
ReFIne: A Framework for Trustworthy Large Reasoning Models with Reliability, Faithfulness, and Interpretability
CE Sun, G Yan, A Kulkarni, TW Weng
arXiv preprint arXiv:2510.09062, 2025
RAT: Boosting Misclassification Detection Ability without Extra Data
G Yan, TW Weng
arXiv preprint arXiv:2503.14783, 2025
Multimodal Concept Bottleneck Models
T Shi, G Yan, T Oikarinen, TW Weng
Mechanistic Interpretability Workshop at NeurIPS 2025
A Principled Evaluation Framework for Neuron Explanations
T Oikarinen, G Yan, TW Weng