Jiasen Lu
Research Scientist, Apple
Verified email at apple.com - Homepage
Title · Cited by · Year
Vqa: Visual question answering
A Agrawal*, J Lu*, S Antol*, M Mitchell, CL Zitnick, D Parikh, D Batra
International Journal of Computer Vision 123 (1), 4-31, 2017
Cited by 8055* · 2017
Vqa: Visual question answering
S Antol, A Agrawal, J Lu, M Mitchell, D Batra, C Lawrence Zitnick, ...
Proceedings of the IEEE International Conference on Computer Vision, 2425-2433, 2015
Cited by 8032 · 2015
Vilbert: Pretraining task-agnostic visiolinguistic representations for vision-and-language tasks
J Lu, D Batra, D Parikh, S Lee
Advances in neural information processing systems, 2019
Cited by 5223 · 2019
Hierarchical question-image co-attention for visual question answering
J Lu, J Yang, D Batra, D Parikh
Advances in neural information processing systems 29, 2016
Cited by 2249 · 2016
Knowing when to look: Adaptive attention via a visual sentinel for image captioning
J Lu*, C Xiong*, D Parikh, R Socher
Proceedings of the IEEE Conference on Computer Vision and Pattern …, 2017
Cited by 2125 · 2017
Graph R-CNN for Scene Graph Generation
J Yang*, J Lu*, S Lee, D Batra, D Parikh
arXiv preprint arXiv:1808.00191, 2018
Cited by 1160 · 2018
Neural Baby Talk
J Lu*, J Yang*, D Batra, D Parikh
In Proceedings of the IEEE conference on computer vision and pattern …, 2018
Cited by 637 · 2018
12-in-1: Multi-Task Vision and Language Representation Learning
J Lu*, V Goswami*, M Rohrbach, D Parikh, S Lee
Proceedings of the IEEE Conference on Computer Vision and Pattern …, 2019
Cited by 610 · 2019
Unified-IO: A unified model for vision, language, and multi-modal tasks
J Lu, C Clark, R Zellers, R Mottaghi, A Kembhavi
arXiv preprint arXiv:2206.08916, 2022
Cited by 573 · 2022
Molmo and pixmo: Open weights and open data for state-of-the-art multimodal models
M Deitke, C Clark, S Lee, R Tripathi, Y Yang, JS Park, M Salehi, ...
arXiv e-prints, arXiv: 2409.17146, 2024
Cited by 509* · 2024
ParlAI: A dialog research software platform
A Miller, W Feng, D Batra, A Bordes, A Fisch, J Lu, D Parikh, J Weston
Proceedings of the 2017 conference on empirical methods in natural language …, 2017
Cited by 483 · 2017
Self-monitoring navigation agent via auxiliary progress estimation
CY Ma, J Lu, Z Wu, G AlRegib, Z Kira, R Socher, C Xiong
arXiv preprint arXiv:1901.03035, 2019
Cited by 353 · 2019
MERLOT Reserve: Neural script knowledge through vision and language and sound
R Zellers, J Lu, X Lu, Y Yu, Y Zhao, M Salehi, A Kusupati, J Hessel, ...
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2022
Cited by 332 · 2022
Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision Language Audio and Action
J Lu, C Clark, S Lee, Z Zhang, S Khosla, R Marten, D Hoiem, A Kembhavi
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024
Cited by 290 · 2024
Multi-modal answer validation for knowledge-based VQA
J Wu, J Lu, A Sabharwal, R Mottaghi
Proceedings of the AAAI conference on artificial intelligence 36 (3), 2712-2721, 2022
Cited by 199 · 2022
Sentinel gate for modulating auxiliary information in a long short-term memory (LSTM) neural network
LU Jiasen, C Xiong, R Socher
US Patent 10,565,306, 2020
Cited by 162 · 2020
Best of both worlds: Transferring knowledge from discriminative learning to a generative visual dialog model
J Lu, A Kannan, J Yang, D Parikh, D Batra
Advances in Neural Information Processing Systems 30, 2017
Cited by 159 · 2017
Container: Context aggregation network
P Gao, J Lu, H Li, R Mottaghi, A Kembhavi
arXiv preprint arXiv:2106.01401, 2021
Cited by 145* · 2021
X-LXMERT: Paint, caption and answer questions with multi-modal transformers
J Cho, J Lu, D Schwenk, H Hajishirzi, A Kembhavi
arXiv preprint arXiv:2009.11278, 2020
Cited by 135 · 2020
Adaptive attention model for image captioning
LU Jiasen, C Xiong, R Socher
US Patent 10,565,305, 2020
Cited by 126 · 2020
Articles 1–20