All Works

Dynamic Facial Dataset Capture and Processing for Visual Speech Recognition using an RGB-D Sensor

Naveed Ahmed, University of Sharjah
Mohammed Lataifeh, University of Sharjah
Imran Junejo, Zayed University

Document Type

Article

Source of Publication

IAENG International Journal of Computer Science

Publication Date

11-23-2020

Abstract

This work presents a new RGB-D acquisition system for capturing a comprehensive dynamic facial dataset that can be used for visual speech recognition. The system uses a Kinect to record detailed facial features of a person. The dataset comprises the facial data of 20 individuals saying 20 common English words or phrases. The acquisition system employs Kinect facial tracking, which records a large number of dynamic facial features: facial points, facial outline, RGB data, depth data, the mapping between RGB and depth data, facial animation units, facial shape units, and 2D and 3D representations of the face along with the 3D head orientation. The effectiveness of the acquired RGB-D dynamic facial dataset is demonstrated by presenting a new visual speech recognition method that employs three-dimensional spatiotemporal data of different facial feature points. A number of visual speech recognition methods from the literature are also tested on the new dataset, and they obtain comparable or favorable visual speech recognition results. The results show that the proposed RGB-D dynamic facial dataset can be effectively employed in a visual speech recognition system.
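
As an illustration only, the per-frame features listed in the abstract could be held in a record like the one sketched below, with the "three-dimensional spatiotemporal data" of a feature point read as its 3D position followed over consecutive frames. This is a minimal sketch under stated assumptions, not the authors' file format or the Kinect SDK's API: the class `FaceFrame`, its field names, the pitch/yaw/roll reading of head orientation, and the helpers `point_trajectory` and `frame_displacements` are all hypothetical, and the RGB and depth images are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point2D = Tuple[float, float]
Point3D = Tuple[float, float, float]


@dataclass
class FaceFrame:
    """One tracked frame. Every field name here is hypothetical."""
    face_points_3d: List[Point3D]                                  # 3D facial feature points
    face_points_2d: List[Point2D] = field(default_factory=list)   # 2D projections of the points
    animation_units: List[float] = field(default_factory=list)    # facial animation units
    shape_units: List[float] = field(default_factory=list)        # facial shape units
    head_orientation: Point3D = (0.0, 0.0, 0.0)                    # assumed (pitch, yaw, roll)


def point_trajectory(frames: List[FaceFrame], index: int) -> List[Point3D]:
    """Collect the 3D position of one feature point across all frames."""
    return [frame.face_points_3d[index] for frame in frames]


def frame_displacements(trajectory: List[Point3D]) -> List[Point3D]:
    """Displacement vector of a point between each pair of consecutive frames."""
    return [
        tuple(b - a for a, b in zip(prev, curr))
        for prev, curr in zip(trajectory, trajectory[1:])
    ]
```

A recorded word would then be a list of `FaceFrame` objects, and the displacement sequence of selected points is one possible spatiotemporal descriptor that a recognizer could consume. The method actually used in the article may differ.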

ISSN

1819-9224

Volume

Issue

First Page

786

Last Page

791

Disciplines

Computer Sciences

Keywords

Facial Dataset, Facial Tracking, Kinect, RGB-D, Visual Speech Recognition

Scopus ID

85102344356

Recommended Citation

Ahmed, Naveed; Lataifeh, Mohammed; and Junejo, Imran, "Dynamic Facial Dataset Capture and Processing for Visual Speech Recognition using an RGB-D Sensor" (2020). All Works. 4107.
https://zuscholars.zu.ac.ae/works/4107

Indexed in Scopus

yes

Open Access

yes

Open Access Type

Bronze: This publication is openly available on the publisher’s website but without an open license

