Towards Highly-Efficient k-Nearest Neighbor Algorithm for Big Data Classification

Author First name, Last name, Institution

Hassan I. Abdalla, Zayed University
Ali A. Amer, Taiz University

Document Type

Conference Proceeding

Source of Publication

2022 5th International Conference on Networking, Information Systems and Security: Envisage Intelligent Systems in 5g//6G-based Interconnected Digital Worlds (NISS)

Publication Date

3-31-2022

Abstract

The k-nearest neighbors (kNN) algorithm is naturally used to search for the nearest neighbors of a test point in a feature space. Many works in the literature aim to accelerate data classification with kNN. In parallel with these works, we present a novel k-nearest neighbor variation with a neighboring calculation property, called NCP-kNN. NCP-kNN addresses the search complexity of kNN as well as the issue of high-dimensional classification. These two problems cause an exponentially increasing level of complexity, particularly with big datasets and multiple k values. In NCP-kNN, each test point's distance is computed against only a limited number of training points instead of the entire dataset. Experimental results on six small datasets show that the performance of NCP-kNN is equivalent to that of standard kNN on small and big datasets, with NCP-kNN being highly efficient. Furthermore, surprisingly, results on big datasets demonstrate that NCP-kNN is not just faster than standard kNN but also significantly superior. The findings, on the whole, show that NCP-kNN is a promising technique as a highly efficient kNN variation for big data classification.
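The abstract's central idea, comparing each test point against a limited set of training points rather than the whole dataset, can be illustrated with a minimal Python sketch. This is not the authors' NCP-kNN: the abstract does not say how the limited set is chosen, so the rule used here (keep the training points whose Euclidean norm is closest to the query's norm) is purely an assumption, and the names `knn_predict`, `RestrictedKNN`, and `window` are hypothetical.

```python
import bisect
import math
from collections import Counter


def knn_predict(train, labels, query, k):
    """Standard kNN: measure the distance from `query` to every training point."""
    order = sorted(range(len(train)), key=lambda i: math.dist(train[i], query))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


class RestrictedKNN:
    """kNN that compares a query against a small candidate window only.

    Assumption: candidates are the training points whose Euclidean norm is
    closest to the query's norm. This rule is a placeholder for illustration,
    not the selection rule of NCP-kNN.
    """

    def __init__(self, train, labels, window):
        norms = [math.hypot(*p) for p in train]
        # Sort the training set once by norm so candidates can be found
        # with a binary search at prediction time.
        order = sorted(range(len(train)), key=norms.__getitem__)
        self.norms = [norms[i] for i in order]
        self.train = [train[i] for i in order]
        self.labels = [labels[i] for i in order]
        self.window = window

    def predict(self, query, k):
        pos = bisect.bisect_left(self.norms, math.hypot(*query))
        # Take up to `window` points around the insertion position,
        # shifting the window back inside the array at either end.
        lo = max(0, pos - self.window // 2)
        hi = min(len(self.train), lo + self.window)
        lo = max(0, hi - self.window)
        # Distances are computed for the candidates only.
        return knn_predict(self.train[lo:hi], self.labels[lo:hi], query, k)
```

With this sketch, each prediction costs one binary search plus `window` distance computations instead of one distance per training point. Whether such a restriction preserves accuracy depends entirely on the selection rule, which is the part the paper itself contributes.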

ISBN

978-1-6654-5363-9

Publisher

IEEE

Volume

00

First Page

1

Last Page

5

Disciplines

Computer Sciences

Keywords

Training, Machine learning algorithms, Text categorization, Machine learning, Big Data, Complexity theory, Classification algorithms

Indexed in Scopus

no

Open Access

no
