The Impact of Data Normalization on KNN Rendering
Source of Publication
Lecture Notes on Data Engineering and Communications Technologies
Data normalization is a vital preprocessing technique in which data is scaled or transformed so that all features contribute equally. The success of classifiers such as the K-Nearest Neighbors (KNN) algorithm depends heavily on data quality to generalize classification models. KNN is among the simplest and most widely used models for machine learning tasks, including text classification, pattern recognition, plagiarism and intrusion detection, ranking models, and sentiment analysis. While KNN is fundamentally based on similarity measures, its performance is also highly contingent on the nature and representation of the data. It is commonly accepted in the literature that data must be normalized for KNN to achieve competitive performance, which raises a key question: which normalization method yields the best performance? To answer this question, this work investigates data normalization with KNN, a topic that has not yet received adequate attention. We provide a comparative study of the impact of data normalization on KNN performance using six normalization methods, namely Decimal, L2-Norm, Max/Min, Std Norm, TFIDF, and BoW. Experimental results on eight publicly available datasets show that no single method dominates the others. However, the L2-Norm, Decimal, and TFIDF methods obtained the best performance on most evaluation metrics (accuracy, precision, and recall). Moreover, run-time analysis shows that KNN works most efficiently with BoW, followed by TFIDF.
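To make the compared schemes concrete, the sketch below implements four of the abstract's numeric normalization methods (Max/Min, Std Norm, L2-Norm, and Decimal scaling) with plain NumPy. These are textbook formulations, not the paper's own code, and the toy matrix `X` is a hypothetical illustration: without normalization, the large-valued second feature would dominate KNN's Euclidean distances.

```python
import numpy as np

def min_max(X):
    # Max/Min: scale each feature (column) to the [0, 1] range
    rng = X.max(axis=0) - X.min(axis=0)
    return (X - X.min(axis=0)) / np.where(rng == 0, 1, rng)

def std_norm(X):
    # Std Norm (z-score): zero mean, unit variance per feature
    std = X.std(axis=0)
    return (X - X.mean(axis=0)) / np.where(std == 0, 1, std)

def l2_norm(X):
    # L2-Norm: scale each sample (row) to unit Euclidean length
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.where(norms == 0, 1, norms)

def decimal_scaling(X):
    # Decimal: divide each feature by 10^j, the smallest power
    # of ten that brings every absolute value below 1
    j = np.ceil(np.log10(np.abs(X).max(axis=0) + 1e-12))
    return X / 10.0 ** np.maximum(j, 0)

# Hypothetical data: feature scales differ by three orders of magnitude
X = np.array([[1.0, 2000.0],
              [2.0, 3000.0],
              [3.0, 1000.0]])
print(min_max(X))  # both columns now lie in [0, 1]
```

After any of these transforms, the two features carry comparable weight in the distance computation, which is why the choice of method can shift KNN's accuracy, precision, and recall as reported in the study.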
Springer Nature Switzerland
KNN, Normalization, Text Classification, Machine Learning, Performance Evaluation
Abdalla, Hassan I. and Altaf, Aneela, "The Impact of Data Normalization on KNN Rendering" (2023). All Works. 6047.
Indexed in Scopus