A Late Multi-modal Fusion Model for Detecting Hybrid Spam E-mail
Source of Publication
International Journal of Computer Theory and Engineering
In recent years, spammers have tried to evade spam filtering systems by introducing hybrid spam e-mail that combines image and text parts, which is more destructive and more complicated for cyber security than e-mail containing text or images only. Traditionally, Optical Character Recognition (OCR) technology has been used to handle the image parts of spam by transforming images into text. Although OCR scanning is a very successful technique for processing text-and-image hybrid spam, it is not an effective solution for large volumes of e-mail because of the Central Processing Unit (CPU) power it requires and the execution time it takes to scan e-mail files. To address this problem, this paper proposes a late multi-modal fusion model for a text-and-image hybrid spam e-mail filtering system and compares it with the classical early fusion detection model based on the OCR method. A Convolutional Neural Network (CNN) and Continuous Bag of Words were used to extract features from the image and text parts of hybrid spam, respectively, and the generated features were fed to a sigmoid layer and machine-learning-based classifiers to classify each e-mail as ham or spam. The two resulting classification probabilities were fed to a late decision model, and the final classification decisions were compared with those of text-only classifiers based on the OCR technique in terms of prediction accuracy and computational efficiency. The experimental results show that the proposed late fusion model is far superior to the benchmark in execution time, while its other performance metrics are adequate. These findings reveal the advantages of using a CNN rather than OCR to detect hybrid spam e-mails.
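As a rough illustration of the late decision step described in the abstract, the sketch below combines one spam probability from an image branch and one from a text branch into a single ham/spam decision. The abstract does not state the fusion rule, so the weighted average used here is an assumption, as are the names `ModalityScores`, `late_fusion`, `w_image`, and `threshold`; none of them come from the paper.

```python
from dataclasses import dataclass


@dataclass
class ModalityScores:
    """Spam probabilities from the two branches (hypothetical container)."""
    p_image: float  # e.g. output of a sigmoid layer on CNN image features
    p_text: float   # e.g. output of a classifier on text features


def late_fusion(scores: ModalityScores,
                w_image: float = 0.5,
                threshold: float = 0.5) -> tuple[float, str]:
    """Fuse two per-modality probabilities into one decision.

    Assumption: a weighted average followed by a threshold. The paper's
    actual late decision model may differ.
    """
    if not 0.0 <= w_image <= 1.0:
        raise ValueError("w_image must lie in [0, 1]")
    for p in (scores.p_image, scores.p_text):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")

    # Weighted average of the two branch probabilities.
    fused = w_image * scores.p_image + (1.0 - w_image) * scores.p_text
    label = "spam" if fused >= threshold else "ham"
    return fused, label


if __name__ == "__main__":
    fused, label = late_fusion(ModalityScores(p_image=0.9, p_text=0.3))
    print(f"fused probability = {fused:.2f}, decision = {label}")
```

Because each branch only has to emit a probability, the image branch can be swapped or retrained without touching the text branch, and no OCR pass is needed before classification.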
Medicine and Health Sciences
Convolutional neural network, cyber security, hybrid spam e-mail, late fusion, spam filtering
Zhang, Zhibo; Damiani, Ernesto; Hamadi, Hussam; Yeun, Chan; and Taher, Fatma, "A Late Multi-modal Fusion Model for Detecting Hybrid Spam E-mail" (2023). All Works. 5907.
Indexed in Scopus
Open Access Type
Gold: This publication is openly available in an open access journal/series