ORCID Identifiers

0000-0002-2002-948X

Document Type

Article

Source of Publication

PeerJ Computer Science

Publication Date

7-29-2021

Abstract

In Information Retrieval (IR), Data Mining (DM), and Machine Learning (ML), similarity measures are widely used for text clustering and classification. The similarity measure is the cornerstone on which the performance of most DM and ML algorithms depends. Yet the search in the literature for a similarity measure that is both effective and efficient remains immature. Some recently proposed similarity measures are effective but have a complex design and suffer from inefficiencies. This work therefore develops an effective and efficient similarity measure of simple design for text-based applications. The measure, driven by Boolean logic algebra basics (BLAB-SM), aims to reach the desired accuracy at the fastest run time compared with recently developed state-of-the-art measures. A comprehensive evaluation is presented using the term frequency–inverse document frequency (TF-IDF) schema, the K-nearest neighbor (KNN) algorithm, and the K-means clustering algorithm. BLAB-SM is evaluated experimentally against seven similarity measures on two popular datasets, Reuters-21 and Web-KB. The experimental results show that BLAB-SM is not only more efficient but also significantly more effective than state-of-the-art similarity measures on both classification and clustering tasks.

DOI Link

10.7717/peerj-cs.641

ISSN

2376-5992

Publisher

PeerJ

Volume

Disciplines

Computer Sciences

Scopus ID

85112800291

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Recommended Citation

Abdalla, Hassan I. and Amer, Ali A., "Boolean logic algebra driven similarity measure for text based applications" (2021). All Works. 4433.
https://zuscholars.zu.ac.ae/works/4433

Indexed in Scopus

yes

Open Access

yes

Open Access Type

Gold: This publication is openly available in an open access journal/series

Included in

Computer Sciences Commons

Boolean logic algebra driven similarity measure for text based applications
