Incorporating External Information For Visual Question Answering



Incorporating External Information for Visual Question Answering



Author: Jialin Wu (Ph. D.)

language: en

Publisher:

Release Date: 2022


Visual question answering (VQA) has recently emerged as a challenging multi-modal task and has gained popularity. The goal is to answer questions that query information associated with the visual content of a given image. Since the required information may come from both inside and outside the image, common types of visual features, such as object and attribute detections, fail to provide enough information for answering the questions. External information, such as captions, explanations, encyclopedia articles, and commonsense databases, can help VQA systems comprehensively understand the image, reason along the right path, and access external facts. Specifically, these sources provide concise descriptions of the image, precise reasons for the correct answer, and factual knowledge beyond the image. In this dissertation, we present our work on generating image captions that are targeted to help answer a specific visual question. We use explanations to identify the critical objects and thereby prevent VQA models from taking language-prior shortcuts. We introduce an approach that generates textual explanations and uses them to determine which answer is best supported. Finally, we explore retrieving and exploiting external knowledge beyond the visual content, which is indispensable for answering knowledge-based visual questions.

Visual Question Answering



Author: Qi Wu

language: en

Publisher: Springer Nature

Release Date: 2022-05-13


Visual Question Answering (VQA) usually combines visual inputs such as images and videos with a natural language question concerning the input and generates a natural language answer as the output. This is by nature a multi-disciplinary research problem, involving computer vision (CV), natural language processing (NLP), knowledge representation and reasoning (KR), etc. Further, VQA is an ambitious undertaking, as it must overcome the challenges of general image understanding and the question-answering task, as well as the difficulties entailed by using large-scale databases with mixed-quality inputs. However, with the advent of deep learning (DL), and driven by advanced techniques in both CV and NLP and the availability of relevant large-scale datasets, we have recently seen enormous strides in VQA, with more systems and promising results emerging. This book provides a comprehensive overview of VQA, covering fundamental theories, models, datasets, and promising future directions. Given its scope, it can be used as a textbook on computer vision and natural language processing, especially for researchers and students in the area of visual question answering. It also highlights the key models used in VQA.

Proceedings of International Conference on Advances in Computer Engineering and Communication Systems



Author: C. Kiran Mai

language: en

Publisher: Springer Nature

Release Date: 2021-01-22


This book comprises the best deliberations on the theme “Smart Innovations in Mezzanine Technologies, Data Analytics, Networks and Communication Systems” from the “International Conference on Advances in Computer Engineering and Communication Systems (ICACECS 2020)”, organized by the Department of Computer Science and Engineering, VNR Vignana Jyothi Institute of Engineering and Technology. The book provides insights into recent trends and developments in the field of computer science, with a special focus on mezzanine technologies, and creates an arena for collaborative innovation. The book focuses on advanced topics in artificial intelligence, machine learning, data mining and big data computing, cloud computing, the Internet of Things, distributed computing, and smart systems.