All Works

On the Integration of Similarity Measures with Machine Learning Models to Enhance Text Classification Performance

Document Type

Article

Source of Publication

Information Sciences

Publication Date

10-1-2022

Abstract

Several techniques have long been proposed to enhance text classification performance, including classifier ensembles, feature selection, the integration of similarity measures with classifiers, and meta-heuristic algorithms. The integration of similarity measures with machine learning (ML) models, however, has not yet been analyzed thoroughly for text classification. To investigate the impact of this integration, this work makes three major contributions: (1) it proposes newly integrated models and presents benchmarking studies of the integration methodology over balanced and imbalanced datasets; (2) it offers a detailed analysis of dozens of integrated models that are experimentally shown to significantly outperform the state of the art, constructed from fourteen similarity measures, three knowledge representations (BoW, TF-IDF, and word embedding), and five models (Support Vector Machine, N-Centroid-based Classifier, Multinomial Naïve Bayesian, Convolutional Neural Network, and Artificial Neural Network); and (3) it introduces significantly effective and highly efficient variations of these five models. The integrated models were evaluated internally against their baselines and externally against state-of-the-art models. The internal evaluation consistently showed a total enhancement rate of 49.3% and 59% over the balanced and imbalanced datasets, respectively, and the external evaluation attested to the superiority of the integrated models.

DOI Link

10.1016/j.ins.2022.10.004

ISSN

0020-0255

Publisher

Elsevier BV

Disciplines

Computer Sciences

Recommended Citation

Abdalla, Hassan I. and Amer, Ali A., "On the Integration of Similarity Measures with Machine Learning Models to Enhance Text Classification Performance" (2022). All Works. 5394.
https://zuscholars.zu.ac.ae/works/5394

Indexed in Scopus

Open Access