Enhancing data classification using locally informed weighted k-nearest neighbor algorithm
Document Type
Article
Source of Publication
Expert Systems with Applications
Publication Date
6-1-2025
Abstract
In this work, a novel locally informed weighted kNN algorithm (LIWkNN) is presented to reduce the detrimental impact of outliers and class imbalance. LIWkNN considers the labels of both the query and its neighboring data points, focusing on the vicinity of the query point so that it captures local patterns and variations. The algorithm updates the weights assigned to the neighbors by comparing their labels, and these weights are then used to predict the label of the query. First, all training-point weights are set to 1. Second, predictions are made with the conventional kNN classifier and checked against the query's actual label in the test data; the neighbors' weights are updated only when the predicted label differs from the actual label, and are left unchanged otherwise. Under this weight-update scheme, an outlier's influence on the weighted kNN classification is minimized, particularly when the outlier is frequently selected as a neighbor for different queries. Third, to address class imbalance, the method adjusts the weighting according to class density, ensuring that minority-class points predominantly receive neighbors from their own class. Finally, once the weight-update process is complete, the proposed kNN classifies the test points using the final weights. LIWkNN's competitive performance and straightforward architecture demonstrate the model's novelty, setting it apart from its state-of-the-art competitors. To validate LIWkNN's generalizability on a broad range of datasets, a comprehensive assessment using five evaluation measures (accuracy, F1-measure, ROC, mean absolute error (MAE), and geometric mean (GM)) across sixty datasets (balanced, imbalanced, noisy, time-series, and image) is carried out in six experimental phases.
According to the results, supported by a multi-criteria analysis, LIWkNN performs significantly better on the vast majority of the datasets considered, both overall and for specific k values.
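The abstract outlines the pipeline but not its exact formulas, so the following is only a minimal Python sketch of the described steps under stated assumptions: weights start at 1, a leave-one-out kNN prediction is compared with each query's true label, disagreeing neighbors are down-weighted by an assumed multiplicative `penalty` factor, and an assumed inverse-class-density scaling approximates the density-based rebalancing. All function names and parameters here are illustrative, not the authors' implementation.

```python
import numpy as np
from collections import Counter

def knn_neighbors(X, query, k):
    """Indices of the k nearest points to the query (Euclidean distance)."""
    dists = np.linalg.norm(X - query, axis=1)
    return np.argsort(dists)[:k]

def weighted_vote(labels, w):
    """Weighted majority vote over the neighbor labels."""
    scores = {}
    for lab, wi in zip(labels, w):
        scores[lab] = scores.get(lab, 0.0) + wi
    return max(scores, key=scores.get)

def fit_liw_weights(X, y, k=5, penalty=0.9, passes=1):
    """Learn per-point weights: initialize to 1, down-weight neighbors that
    disagree with a query's true label whenever the leave-one-out prediction
    is wrong, then scale by inverse class density (both rules assumed)."""
    n = len(X)
    w = np.ones(n)
    freq = Counter(y)
    for _ in range(passes):
        for q in range(n):
            idx = knn_neighbors(X, X[q], k + 1)
            idx = idx[idx != q][:k]              # leave the query itself out
            pred = weighted_vote(y[idx], w[idx])
            if pred != y[q]:
                # Penalize only the neighbors whose label differs from the
                # query's label (assumed form of the update rule).
                wrong = idx[y[idx] != y[q]]
                w[wrong] *= penalty
    # Inverse-density scaling so minority-class points keep influence.
    w *= np.array([len(y) / (len(freq) * freq[c]) for c in y])
    return w

def predict(X, y, w, query, k=5):
    """Classify a test point with the final weights (last step in the abstract)."""
    idx = knn_neighbors(X, query, k)
    return weighted_vote(y[idx], w[idx])
```

In this sketch, a class-1 outlier sitting inside a class-0 region still has its vote diluted at prediction time because the surrounding class-0 points retain or gain weight while the minority-scaling factor is shared across its whole class; the real algorithm's update and density formulas may differ.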
Publisher
Elsevier BV
Volume
276
Disciplines
Computer Sciences
Keywords
Artificial intelligence, Data classification, Data mining, k-nearest neighbor, kNN, Machine learning
Recommended Citation
Abdalla, Hassan I. and Amer, Ali A., "Enhancing data classification using locally informed weighted k-nearest neighbor algorithm" (2025). All Works. 7158.
https://zuscholars.zu.ac.ae/works/7158
Indexed in Scopus
yes
Open Access
no