Hanan Gani
University of California San Diego; Mohamed Bin Zayed University of Artificial Intelligence
Verified email at mbzuai.ac.ae
Title
Cited by
Year
Align your prompts: Test-time prompting with distribution alignment for zero-shot generalization
J Abdul Samadh, MH Gani, N Hussein, MU Khattak, MM Naseer, ...
Advances in Neural Information Processing Systems 36, 80396-80413, 2023
Cited by: 137; Year: 2023
How to Train Vision Transformer on Small-scale Datasets?
H Gani, M Naseer, M Yaqub
33rd British Machine Vision Conference (BMVC) 2022, 2022
Cited by: 90; Year: 2022
LLM Blueprint: Enabling Text-to-Image Generation with Complex and Detailed Prompts
H Gani, SF Bhat, M Naseer, S Khan, P Wonka
Twelfth International Conference on Learning Representations (ICLR) 2024, 2024
Cited by: 60; Year: 2024
VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos
S Munasinghe, H Gani, W Zhu, J Cao, E Xing, FS Khan, S Khan
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2025, 2025
Cited by: 27; Year: 2025
Test-Time Low Rank Adaptation via Confidence Maximization for Zero-Shot Generalization of Vision-Language Models
R Imam, H Gani, M Huzaifa, K Nandakumar
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2025, 2025
Cited by: 20; Year: 2025
A Supervised Learning Methodology for Real-Time Disguised Face Recognition in the Wild
S Kumaar, A Dogra, A Majeedi, H Gani, RM Vishwanath, SN Omkar
arXiv preprint arXiv:1809.02875, 2018
Cited by: 17*; Year: 2018
VANE-Bench: Video Anomaly Evaluation Benchmark for Conversational LMMs
H Gani, R Bharadwaj, M Naseer, FS Khan, S Khan
Findings of the Association for Computational Linguistics: NAACL 2025, 3123-3140, 2025
Cited by: 13*; Year: 2025
AgriCLIP: Adapting CLIP for Agriculture and Livestock via Domain-Specialized Cross-Model Alignment
U Nawaz, M Awais, H Gani, M Naseer, F Khan, S Khan, RM Anwer
International Conference on Computational Linguistics (COLING), 2025
Cited by: 13; Year: 2025
Aurelia: Test-Time Reasoning Distillation in Audio-Visual LLMs
S Chowdhury, H Gani, N Anand, S Nag, R Gao, M Elhoseiny, S Khan, ...
IEEE/CVF International Conference on Computer Vision (ICCV) 2025, 2025
Cited by: 6; Year: 2025
Agent-X: Evaluating Deep Multimodal Reasoning in Vision-Centric Agentic Tasks
T Ashraf, A Saqib, H Ghani, M AlMahri, Y Li, N Ahsan, U Nawaz, J Lahoud, ...
arXiv preprint arXiv:2505.24876, 2025
Cited by: 4; Year: 2025
VideoMolmo: Spatio-Temporal Grounding Meets Pointing
GS Ahmad, A Heakl, H Gani, A Shaker, Z Shen, FS Khan, S Khan
arXiv preprint arXiv:2506.05336, 2025
Cited by: 2; Year: 2025
Multi-Attribute Vision Transformers are Efficient and Robust Learners
H Gani, N Saadi, N Hussein, K Nandakumar
IEEE International Conference on Image Processing (ICIP), 2024
Cited by: 2; Year: 2024
MedContext: Learning Contextual Cues for Efficient Volumetric Medical Segmentation
H Gani, M Naseer, F Khan, S Khan
Medical Image Computing and Computer Assisted Intervention (MICCAI), 2024
Cited by: 1; Year: 2024
System and method of training vision transformer on small-scale datasets
MH GANI, MM NASEER, M YAQUB
US Patent 12,417,620, 2025
2025
VideoMolmo: Spatio-Temporal Grounding Meets Pointing
G Shazan Ahmad, A Heakl, H Gani, A Shaker, Z Shen, R Krishna, ...
arXiv e-prints, arXiv: 2506.05336, 2025
2025
Text-to-Image Diffusion with Complex and Detailed Prompts
H Gani
MS Thesis, Mohamed Bin Zayed University of Artificial Intelligence (MBZUAI), 2024
2024
Analyzing the Robustness and the Reliability of Large Language Models
H Gani, R Bharadwaj, M Huzaifa