Understanding ME? Multimodal Evaluation for Fine-grained Visual Commonsense

Outside Knowledge Visual Question Answering Version 2.0

Aaryan Shah

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

SGEITL: Scene Graph Enhanced Image-Text Learning for Visual Commonsense Reasoning

Zhecan Wang

Proceedings of the AAAI Conference on Artificial Intelligence

Bridging the Gap between Recognition-level Pre-training and Commonsensical Vision-language Tasks

Zhecan Wang

Proceedings of the First Workshop on Commonsense Representation and Reasoning (CSRR 2022)

Vision Meets Language: Multimodal Transformers Elevating Predictive Power in Visual Question Answering

Ehsanur Rahman Rhythm, Md Humaion Kabir Mehedi, Prima Sarker, Sajidul Islam Khandaker

ICCIT, 2023

XDBERT: Distilling Visual Information to BERT from Cross-Modal Systems to Improve Language Understanding

Yu Tsao

Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Multimodal Learning for Accurate Visual Question Answering: An Attention-based Approach

Benyamin Ahmadnia, PhD, Anurag Balakrishnan, Satyam Pathak, Jishnu Bhardwaj, Ishan Unnarkar, Aniruddha Gawande

RANLP, 2023

'Just because you are right, doesn't mean I am wrong': Overcoming a Bottleneck in the Development and Evaluation of Open-Ended Visual Question Answering (VQA) Tasks

Akarshan Sajja

2021

Revisiting Visual Question Answering Baselines

Allan Jabri

Computer Vision – ECCV 2016, 2016

A Thousand Words Are Worth More Than a Picture: Natural Language-Centric Outside-Knowledge Visual Question Answering

Aishwarya N Reganti

2022

Visually Grounded Commonsense Knowledge Acquisition

Stefan Wermter

Proceedings of the AAAI Conference on Artificial Intelligence, Volume 37, Number 5, 2023

Multi-Image Visual Question Answering

Harsh Raj

ArXiv, 2021

Cross-Modality Relevance for Reasoning on Language and Vision

Parisa Kordjamshidi

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Enforcing Reasoning in Visual Commonsense Reasoning

Mehrab Tanjim

ArXiv, 2019

Towards a Unified Model for Generating Answers and Explanations in Visual Question Answering

Tillman Weyde

arXiv (Cornell University), 2023

iReason: Multimodal Commonsense Reasoning using Videos and Natural Language with Interpretability

Aman Chadha

ArXiv, 2021

Select, Substitute, Search: A New Benchmark for Knowledge-Augmented Visual Question Answering

Aman Jain

Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021

Towards Solving Multimodal Comprehension

Pritish Sahu

ArXiv, 2021

Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering

Douglas Summers-Stay

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

TraVLR: Now You See It, Now You Don't! A Bimodal Dataset for Evaluating Visio-Linguistic Reasoning

Min-Yen Kan

arXiv (Cornell University), 2021

Visual question answering: Datasets, algorithms, and future challenges

Kushal Kafle

Computer Vision and Image Understanding

Generating Natural Questions from Images for Multimodal Assistants

Akanksha Bindal

ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Seeing the World through Text: Evaluating Image Descriptions for Commonsense Reasoning in Machine Reading Comprehension

Kentaro Inui

2020

Component Analysis for Visual Question Answering Architectures

Camila Kolling

2020 International Joint Conference on Neural Networks (IJCNN), 2020

Visual Entailment: A Novel Task for Fine-Grained Image Understanding

Farley Lai

ArXiv, 2019

An Analysis of Visual Question Answering Algorithms

Kushal Kafle

2017 IEEE International Conference on Computer Vision (ICCV)

Multimodal Attention in Recurrent Neural Networks for Visual Question Answering

Lorena Kodra

Global Journal of Computer Science and Technology, 2018
