Document Type

Article

Source of Publication

IEEE Access

Publication Date

1-1-2021

Abstract

There has been much recent work on fraud and Anti Money Laundering (AML) detection using machine learning techniques. However, most algorithms are based on supervised techniques. Studies show that supervised techniques often have the limitation of not adapting well to new irregular fraud patterns when the dataset is highly imbalanced. Instead, unsupervised learning can have a better capability to find anomalous and irregular patterns in new transaction. Despite this, unsupervised techniques also have the disadvantage of not being able to give state-of-the-art detection results. We propose a suite of unsupervised and deep learning techniques to implement an anti-money laundering and fraud detection system to resolve this limitation. The system leverages three deep learning models: autoencoder (AE), variational autoencoder (VAE), and a generative adversarial network. We preprocess the given dataset to separate the Transaction Date attribute into its base components to capture time-related fraud patterns. Also, Wasserstein Generative Adversarial Network (WGAN) is used to generate fraud transactions, which are then mixed with the base dataset to form a more balanced mixed dataset. These two datasets are used to train the AE and VAE models. We built two versions of the AE model (single-loss and multi-loss) besides a novel method of calculating the anomaly score threshold, called Recall-First Threshold (RFT), which helps enhance the model’s performance. Experimental results demonstrated that the False Positive Rate (FPR) drops down to as low as 7% in the proposed multi-loss AE model. In comparison, we achieved an accuracy of 93%, with 100% of the fraud transactions recalled successfully.

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Volume

9

First Page

83762

Last Page

83785

Disciplines

Computer Sciences

Keywords

Support vector machines, Clustering algorithms, Deep learning, Generative adversarial networks, Unsupervised learning, Radio frequency, Decision trees

Scopus ID

85107348389

Indexed in Scopus

yes

Open Access

yes

Open Access Type

Gold: This publication is openly available in an open access journal/series

Share

COinS