Video scene parsing: An overview of deep learning methods and datasets

Document Type

Article

Source of Publication

Computer Vision and Image Understanding

Publication Date

12-1-2020

Abstract

© 2020 Video scene parsing (VSP) has become a key problem in computer vision in recent years because of its wide range of applications in numerous domains (e.g., autonomous driving). With the renaissance of deep learning (DL) techniques, various VSP methods under this framework have demonstrated promising performance. However, no thorough review has comprehensively summarized the advantages and disadvantages of these methods, their datasets, or the directions for development. To remedy this, we provide an overview of the different DL methods applied to VSP in various scientific and engineering areas. First, we describe several indispensable preliminaries of this field, defining essential background concepts and fundamental terminology and differentiating VSP from other similar problems. Second, recent advanced DL methods for VSP are classified and analyzed according to their principles, contributions and importance. Third, we elaborate on the most frequently used datasets and describe common evaluation metrics for VSP. In addition, extensive experimental results for the aforementioned methods are presented to demonstrate their advantages and disadvantages. This is followed by further comparisons and a discussion of the main challenges faced by researchers. Finally, we conclude the paper by summarizing the state-of-the-art methods for VSP and highlighting potential research directions as well as promising future work for DL techniques applied to VSP.

ISSN

1077-3142

Publisher

Elsevier BV

Volume

201

First Page

103077

Disciplines

Computer Sciences

Keywords

Deep Learning, Overview, Video Scene Parsing

Indexed in Scopus

no

Open Access

yes

Open Access Type

Bronze: This publication is openly available on the publisher’s website but without an open license
