PhishOut: Effective Phishing Detection Using Selected Features

Document Type

Conference Proceeding

Source of Publication

Proceedings of the 2020 27th International Conference on Telecommunications, ICT 2020

Publication Date

10-5-2020

Abstract

Phishing emails are the first step for many of today’s attacks. They may contain a simple hyperlink, a request for action, or a full replica of an existing service or website. The goal is generally to trick the user into voluntarily giving away sensitive information, such as login credentials. Many approaches and applications have been proposed and developed to catch and filter phishing emails. However, the problem still lacks a complete and comprehensive solution. In this paper, we apply knowledge discovery principles, from data cleansing, integration, selection, aggregation, and data mining through to knowledge extraction. We study feature effectiveness based on Information Gain and contribute two new features to the literature. We compare six machine-learning approaches to detect phishing based on a small number of carefully chosen features. We calculate false positives, false negatives, mean absolute error, recall, precision and F-measure, and achieve very low false positive and false negative rates. Naive Bayes has the lowest true positive rate, and overall Neural Networks hold the most promise for accurate phishing detection, with an accuracy of 99.4%.
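
For readers unfamiliar with the measures named in the abstract, the following is a minimal sketch of how Information Gain can rank a discrete feature and how precision, recall, F-measure and the false positive and false negative rates follow from confusion-matrix counts. It is not the authors' implementation: the function names, the feature names (has_ip_url, has_greeting), the toy labels and the counts passed to classification_metrics are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels).values()
    return -sum((n / total) * math.log2(n / total) for n in counts)


def information_gain(feature_values, labels):
    """Label entropy minus the expected label entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def classification_metrics(tp, fp, fn, tn):
    """Standard measures derived from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
        "false_negative_rate": fn / (fn + tp) if fn + tp else 0.0,
    }


# Hypothetical toy data: 1 = phishing, 0 = legitimate.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
has_ip_url = [1, 1, 1, 1, 0, 0, 0, 0]    # hypothetical feature that separates the classes
has_greeting = [1, 0, 1, 0, 1, 0, 1, 0]  # hypothetical feature unrelated to the class

ig_ip_url = information_gain(has_ip_url, labels)
ig_greeting = information_gain(has_greeting, labels)

# Hypothetical confusion-matrix counts, not results from the paper.
metrics = classification_metrics(tp=90, fp=10, fn=10, tn=90)
```

A feature with higher Information Gain tells the classifier more about the class label, which is the usual rationale for keeping only a small set of high-gain features.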

ISBN

978-1-7281-6587-5

Publisher

Institute of Electrical and Electronics Engineers Inc.

Last Page

5

Disciplines

Computer Sciences

Keywords

Computer crime, Electronic mail, Feature extraction, Filtration, Hypertext systems, False negatives, Information gain, Knowledge extraction, Machine learning approaches, Mean absolute error, Negative rates, Phishing detections, Sensitive informations, Data mining

Scopus ID

85099595688

Indexed in Scopus

yes

Open Access

yes

Open Access Type

Green: A manuscript of this publication is openly available in a repository
