Understanding ME? Multimodal Evaluation for Fine-grained Visual Commonsense
Related papers
Outside Knowledge Visual Question Answering Version 2.0
ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
SGEITL: Scene Graph Enhanced Image-Text Learning for Visual Commonsense Reasoning
Proceedings of the AAAI Conference on Artificial Intelligence
Bridging the Gap between Recognition-level Pre-training and Commonsensical Vision-language Tasks
Proceedings of the First Workshop on Commonsense Representation and Reasoning (CSRR 2022)
Ehsanur Rahman Rhythm, Md Humaion Kabir Mehedi, Prima Sarker, Sajidul Islam Khandaker
ICCIT, 2023
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Multimodal Learning for Accurate Visual Question Answering: An Attention-based Approach
Benyamin Ahmadnia, Anurag Balakrishnan, Satyam Pathak, Jishnu Bhardwaj, Ishan Unnarkar, Aniruddha Gawande
RANLP, 2023
2021
Revisiting Visual Question Answering Baselines
Computer Vision – ECCV 2016, 2016
2022
Visually Grounded Commonsense Knowledge Acquisition
Proceedings of the AAAI Conference on Artificial Intelligence, Volume 37, Number 5, 2023
Multi-Image Visual Question Answering
ArXiv, 2021
Cross-Modality Relevance for Reasoning on Language and Vision
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020
Enforcing Reasoning in Visual Commonsense Reasoning
ArXiv, 2019
Towards a Unified Model for Generating Answers and Explanations in Visual Question Answering
arXiv (Cornell University), 2023
iReason: Multimodal Commonsense Reasoning using Videos and Natural Language with Interpretability
ArXiv, 2021
Select, Substitute, Search: A New Benchmark for Knowledge-Augmented Visual Question Answering
Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021
Towards Solving Multimodal Comprehension
ArXiv, 2021
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering
2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
TraVLR: Now You See It, Now You Don't! A Bimodal Dataset for Evaluating Visio-Linguistic Reasoning
arXiv (Cornell University), 2021
Visual question answering: Datasets, algorithms, and future challenges
Computer Vision and Image Understanding
Generating Natural Questions from Images for Multimodal Assistants
ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
2020
Component Analysis for Visual Question Answering Architectures
2020 International Joint Conference on Neural Networks (IJCNN), 2020
Visual Entailment: A Novel Task for Fine-Grained Image Understanding
ArXiv, 2019
An Analysis of Visual Question Answering Algorithms
2017 IEEE International Conference on Computer Vision (ICCV)
Multimodal Attention in Recurrent Neural Networks for Visual Question Answering
Global Journal of Computer Science and Technology, 2018