From Known to the Unknown: Transferring Knowledge to Answer Questions about Novel Visual and Semantic Concepts

Document Type

Article

Source of Publication

Image and Vision Computing

Publication Date

11-1-2020

Abstract

© 2020 Elsevier B.V. Current Visual Question Answering (VQA) systems can answer intelligent questions about ‘known’ visual content. However, their performance drops significantly when questions about visually and linguistically ‘unknown’ concepts are presented during inference (the ‘Open-world’ scenario). A practical VQA system should be able to deal with novel concepts in real-world settings. To address this problem, we propose an exemplar-based approach that transfers learning (i.e., knowledge) from previously ‘known’ concepts to answer questions about the ‘unknown’. We learn a highly discriminative joint embedding (JE) space, where visual and semantic features are fused to give a unified representation. Once novel concepts are presented to the model, it looks for the closest match from an exemplar set in the JE space. This auxiliary information is used alongside the given Image-Question pair to refine visual attention in a hierarchical fashion. Our novel attention model is based on a dual-attention mechanism that combines the complementary effects of spatial and channel attention. Since handling high-dimensional exemplars on large datasets can be a significant challenge, we introduce an efficient matching scheme that uses a compact feature description for search and retrieval. To evaluate our model, we propose a new dataset for VQA that separates unknown visual and semantic concepts from the training set. Our approach shows significant improvements over state-of-the-art VQA models on the proposed Open-World VQA dataset and other standard VQA datasets.
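
The exemplar lookup mentioned in the abstract (finding the closest match to a novel concept in the JE space) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the similarity measure, the embedding dimensionality, or the compact feature description used for retrieval, so cosine similarity, the three-dimensional toy vectors, and the names `cosine_similarity` and `closest_exemplar` are all assumptions chosen for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def closest_exemplar(query, exemplars):
    """Return (label, similarity) of the exemplar most similar to the query."""
    best_label, best_score = None, -math.inf
    for label, vector in exemplars.items():
        score = cosine_similarity(query, vector)
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


# Hypothetical toy exemplar set: each known concept maps to a made-up
# joint-embedding vector. Real embeddings would be learned and much larger.
exemplars = {
    "dog": [0.9, 0.1, 0.0],
    "cat": [0.1, 0.9, 0.0],
    "car": [0.0, 0.1, 0.9],
}

# Made-up embedding of an Image-Question pair involving a novel concept.
query = [0.8, 0.3, 0.1]

label, score = closest_exemplar(query, exemplars)
print(label)
```

In this toy setting the query lies closest to the "dog" exemplar, which would then serve as the auxiliary information for the novel concept. A linear scan like this one is only practical for small exemplar sets; the abstract indicates that the paper instead relies on a compact feature description to keep search and retrieval efficient on large datasets.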

DOI Link

10.1016/j.imavis.2020.103985

ISSN

0262-8856

Publisher

Elsevier BV

Volume

103

First Page

103985

Disciplines

Computer Sciences

Keywords

Computer vision, Dataset bias, Deep learning, Natural language processing, Visual Question Answering

Recommended Citation

Farazi, Moshiur R.; Khan, Salman H.; and Barnes, Nick, "From Known to the Unknown: Transferring Knowledge to Answer Questions about Novel Visual and Semantic Concepts" (2020). All Works. 1729.
https://zuscholars.zu.ac.ae/works/1729

Indexed in Scopus

no

Open Access

yes

Open Access Type

Green: A manuscript of this publication is openly available in a repository
