An Innovative Automatic Indexing Method For Arabic Text

Document Type

Article

Source of Publication

Advances in Computing & Engineering

Publication Date

5-20-2023

Abstract

The study of automatic indexing and text retrieval methods for natural language has a long history. Automatic indexing involves extracting words from a document to categorize it by subject matter and to improve information retrieval. Despite extensive research on other languages, automated Arabic text categorization remains little investigated. In this research, the researchers introduce an innovative method that improves the accuracy of automatic indexing of Arabic texts by incorporating a thesaurus. Their approach extracts new relevant words by consulting the thesaurus, which contains words, synonyms, and correlations and was constructed using a natural language toolkit and a WordNet library. Synonyms with similar meanings that frequently appear together are grouped using a JavaScript Object Notation (JSON) dictionary. The results demonstrate a significant improvement in accuracy and efficiency compared to prior studies.
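
The thesaurus lookup described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the thesaurus entries, the function names (`build_lookup`, `index_terms`), the `min_count` threshold, and the whitespace tokenization are all assumptions made for illustration. The sketch only shows the general idea of storing synonym groups in a JSON dictionary and using them to add related words to a document's index terms.

```python
import json
from collections import Counter

# Hypothetical JSON thesaurus: each head word maps to a list of synonyms.
# The entries are illustrative only and are not taken from the article.
THESAURUS_JSON = """
{
  "كتاب": ["مؤلف", "مجلد"],
  "مدرسة": ["معهد", "أكاديمية"]
}
"""


def build_lookup(thesaurus):
    """Map every word (head word or synonym) to its whole synonym group."""
    lookup = {}
    for head, synonyms in thesaurus.items():
        group = {head, *synonyms}
        for word in group:
            lookup[word] = group
    return lookup


def index_terms(tokens, lookup, min_count=1):
    """Return the frequent tokens plus any thesaurus words related to them."""
    counts = Counter(tokens)
    terms = set()
    for word, count in counts.items():
        if count >= min_count:
            terms.add(word)
            # Expand the index with the word's synonym group, if it has one.
            terms |= lookup.get(word, set())
    return terms


thesaurus = json.loads(THESAURUS_JSON)
lookup = build_lookup(thesaurus)

# Naive whitespace tokenization (an assumption; real Arabic text would
# normally need normalization and stemming first).
tokens = "قرأ الطالب كتاب في مدرسة".split()
terms = index_terms(tokens, lookup)
```

Here `terms` contains the five original tokens together with the synonyms of "كتاب" and "مدرسة", so a query using any word in those groups would match the document. Raising `min_count` restricts expansion to words that occur repeatedly in the text.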

ISSN

2735-5985

Volume

3

Issue

1

Disciplines

Computer Sciences

Keywords

Arabic Text, Automatic Indexing, Building Thesaurus, Frequent Sets

Indexed in Scopus

no

Open Access

no
