Depthwise Separable Convolutional Neural Networks for Pedestrian Attribute Recognition

Document Type

Article

Source of Publication

SN Computer Science

Publication Date

2-14-2021

Abstract

Video surveillance is ubiquitous. Beyond understanding the objects in a scene, extracting human visual attributes from surveillance footage has attracted considerable attention in recent years. The task is challenging even for human observers. It is also a multi-label problem: a subject in a scene can have multiple attributes that we hope to recognize, such as shoe type, clothing type, whether the subject is wearing an accessory, or whether the subject is carrying an object. Many solutions have been presented over the years, and many researchers have employed convolutional neural networks (CNNs). In this work, we propose using a Depthwise Separable Convolutional Neural Network (DS-CNN) to solve the pedestrian attribute recognition problem. The network employs depthwise separable convolution layers (DSCL) instead of regular 2D convolution layers. DS-CNN performs extremely well, especially with smaller datasets. In addition, as a compact network, DS-CNN reduces the number of trainable parameters while making learning efficient. We evaluated our method on two benchmark pedestrian datasets, and the results show improvements over the state of the art.
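
The abstract's claim that depthwise separable layers reduce the number of trainable parameters can be illustrated with a small calculation. The sketch below is not taken from the article: the layer sizes (64 input channels, 128 output channels, 3x3 kernel) and the function names are assumptions chosen only for illustration. It compares a regular 2D convolution with a depthwise convolution followed by a 1x1 pointwise convolution.

```python
def conv2d_params(c_in, c_out, k, bias=True):
    """Trainable parameters of a regular k x k 2D convolution."""
    return k * k * c_in * c_out + (c_out if bias else 0)


def depthwise_separable_params(c_in, c_out, k, bias=True):
    """Trainable parameters of a depthwise k x k convolution
    (one filter per input channel) followed by a 1 x 1 pointwise
    convolution that mixes channels."""
    depthwise = k * k * c_in + (c_in if bias else 0)
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise


# Hypothetical layer sizes, chosen for illustration only.
standard = conv2d_params(c_in=64, c_out=128, k=3)
separable = depthwise_separable_params(c_in=64, c_out=128, k=3)

print(f"regular conv:        {standard} parameters")
print(f"depthwise separable: {separable} parameters")
print(f"reduction factor:    {standard / separable:.2f}x")
```

For these assumed sizes, the regular layer has 73,856 parameters and the separable pair has 8,960, roughly an eightfold reduction. The actual savings in DS-CNN depend on the architecture described in the article itself.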

ISSN

2662-995X

Publisher

Springer Science and Business Media LLC

Volume

2

Issue

2

Disciplines

Computer Sciences

Scopus ID

85121380436

Indexed in Scopus

yes

Open Access

no
