All Works

Deepfake Audio Detection via MFCC Features Using Machine Learning

Ameer Hamza, Air University
Abdul Rehman Javed, Air University; Lebanese American University
Farkhund Iqbal, Zayed University
Natalia Kryvinska, Comenius University
Ahmad S. Almadhor, Al Jouf University
Zunera Jalil, Air University
Rouba Borghol, Rochester Institute of Technology - Dubai

Document Type

Article

Source of Publication

IEEE Access

Publication Date

12-21-2022

Abstract

Deepfake content is created or altered synthetically using artificial intelligence (AI) approaches to appear real. It can include synthesized audio, video, images, and text. Deepfakes can now produce natural-looking content, making them harder to identify. Much progress has been made in identifying video deepfakes in recent years; nevertheless, most investigations into detecting audio deepfakes have employed the ASVSpoof or AVSpoof dataset with various machine learning and deep learning algorithms. This research uses machine learning and deep learning-based approaches to identify deepfake audio. The Mel-frequency cepstral coefficients (MFCCs) technique is used to extract the most useful information from the audio. We choose the Fake-or-Real dataset, which is the most recent benchmark dataset. The dataset was created with a text-to-speech model and is divided into four sub-datasets according to audio length and bit rate: for-rece, for-2-sec, for-norm, and for-original. The experimental results show that the support vector machine (SVM) outperformed the other machine learning (ML) models in terms of accuracy on the for-rece and for-2-sec datasets, while the gradient boosting model performed very well on the for-norm dataset. The VGG-16 model produced highly encouraging results when applied to the for-original dataset and outperforms other state-of-the-art approaches.
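
The abstract does not state how the MFCCs were computed or which parameters were used. As a rough illustration of what MFCC extraction involves, the following is a minimal pure-Python sketch, not the authors' implementation. The frame length, hop size, filter count, number of coefficients, the helper names, and the synthetic 440 Hz tone standing in for an audio clip are all assumptions made for this example. In practice a library routine such as `librosa.feature.mfcc` would normally be used instead of a hand-written DFT.

```python
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Power spectrum (bins 0..N/2) of a Hamming-windowed frame via a naive DFT."""
    n = len(frame)
    win = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
           for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(win))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(win))
        spec.append((re * re + im * im) / n)
    return spec

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centres are evenly spaced on the mel scale."""
    mel_max = hz_to_mel(sample_rate / 2)
    edges = [mel_to_hz(mel_max * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_hz = sample_rate / n_fft
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for k in range(n_fft // 2 + 1):
            f = k * bin_hz
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def mfcc(signal, sample_rate, frame_len=256, hop=128, n_filters=20, n_coeffs=13):
    """Return one list of n_coeffs cepstral coefficients per frame.

    All default parameter values are illustrative assumptions.
    """
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        spec = power_spectrum(signal[start:start + frame_len])
        # Log of the energy in each mel band (small floor avoids log(0)).
        log_e = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10)
                 for row in bank]
        # DCT-II decorrelates the log mel energies into cepstral coefficients.
        frames.append([
            sum(e * math.cos(math.pi * c * (m + 0.5) / n_filters)
                for m, e in enumerate(log_e))
            for c in range(n_coeffs)
        ])
    return frames

def clip_features(signal, sample_rate):
    """Average each coefficient over time: one fixed-length vector per clip."""
    frames = mfcc(signal, sample_rate)
    return [sum(col) / len(col) for col in zip(*frames)]

# Synthetic stand-in for an audio clip: a 440 Hz tone sampled at 8 kHz.
sr = 8000
tone = [math.sin(2 * math.pi * 440 * t / sr) for t in range(2048)]
features = clip_features(tone, sr)
```

A fixed-length vector such as `features` is the kind of input a classical classifier (for example an SVM or gradient boosting model, as named in the abstract) can consume, one vector per labelled clip. Averaging over time is one simple pooling choice assumed here; the abstract does not say how frame-level features were aggregated.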

DOI Link

10.1109/access.2022.3231480

ISSN

2169-3536

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Volume

First Page

134018

Last Page

134028

Disciplines

Computer Sciences

Keywords

Deepfakes, Deep learning, Speech synthesis, Training data, Feature extraction, Machine learning algorithms, Data models, Acoustics

Recommended Citation

Hamza, Ameer; Javed, Abdul Rehman; Iqbal, Farkhund; Kryvinska, Natalia; Almadhor, Ahmad S.; Jalil, Zunera; and Borghol, Rouba, "Deepfake Audio Detection via MFCC Features Using Machine Learning" (2022). All Works. 5544.
https://zuscholars.zu.ac.ae/works/5544

Indexed in Scopus

Open Access
