Towards Highly-Efficient k-Nearest Neighbor Algorithm for Big Data Classification

Document Type

Conference Proceeding

Source of Publication

2022 5th International Conference on Networking, Information Systems and Security: Envisage Intelligent Systems in 5G/6G-based Interconnected Digital Worlds (NISS)

Publication Date

3-31-2022

Abstract

The k-nearest neighbors (kNN) algorithm is naturally used to search for the nearest neighbors of a test point in a feature space. A large body of work in the literature aims to accelerate data classification with kNN. In line with these efforts, we present a novel k-nearest neighbor variation with a neighboring-calculation property, called NCP-kNN. NCP-kNN addresses the search complexity of kNN as well as the issue of high-dimensional classification. These two problems cause an exponentially increasing level of complexity, particularly with big datasets and multiple k values. In NCP-kNN, each test point's distance is computed against only a limited number of training points instead of the entire dataset. Experimental results on six small datasets show that NCP-kNN performs on par with standard kNN while being highly efficient. Furthermore, results on big datasets demonstrate that NCP-kNN is not only faster than standard kNN but also significantly superior. On the whole, the findings show that NCP-kNN is a promising, highly efficient kNN variation for big data classification.
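The abstract's core idea is that each test point is compared against only a limited subset of training points rather than the whole dataset. The paper's actual neighboring-calculation rule is not reproduced here; the sketch below shows plain majority-vote kNN with a hypothetical `candidates` parameter standing in for that subset restriction.

```python
from collections import Counter
import math

def knn_predict(train_X, train_y, query, k=3, candidates=None):
    """Classify `query` by majority vote among its k nearest training points.

    `candidates` optionally restricts the distance computations to a subset
    of training indices -- an illustrative stand-in for NCP-kNN's limited
    neighbor search, not the paper's actual selection rule.
    """
    idxs = range(len(train_X)) if candidates is None else candidates
    # Euclidean distance from the query to each considered training point.
    dists = sorted((math.dist(train_X[i], query), train_y[i]) for i in idxs)
    # Majority label among the k closest.
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Toy data: two clusters labeled "a" and "b".
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
y = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(X, y, (0.15, 0.1), k=3))  # query near the "a" cluster
print(knn_predict(X, y, (5.05, 5.0), k=3))  # query near the "b" cluster
```

Restricting `candidates` to, say, half the indices halves the distance computations per query, which is the kind of saving the abstract reports at scale.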

ISBN

978-1-6654-5363-9

Publisher

IEEE

Volume

00

First Page

1

Last Page

5

Disciplines

Computer Sciences

Keywords

Training, Machine learning algorithms, Text categorization, Machine learning, Big Data, Complexity theory, Classification algorithms

Indexed in Scopus

no

Open Access

no
