1. Introduction
Visual Question Answering (VQA) is a challenging task that requires both language-aware reasoning and image understanding. With advances in deep learning, neural networks [37], [34], [6], [13], [18], [17], [19], [29] that model the interactions between vision and language have achieved remarkable results on large-scale benchmark datasets [3], [15], [23], [20].