The construction of a new Lexicon design for Arabic language

Document Type

Conference Proceeding

Source of Publication

Business Transformation through Innovation and Knowledge Management: An Academic Perspective - Proceedings of the 14th International Business Information Management Association Conference, IBIMA 2010

Publication Date

1-1-2010

Abstract

Analyzing Arabic sentences is difficult for several reasons. Sentences are often long and complex, and their structure poses further problems: parts of the syntactic structure may be missing, and words and phrases can appear in different orders. This paper develops and assesses an Arabic lexicon. The new automatic lexicon was developed to analyze Arabic words and extract their attributes. It was implemented as a two-step process: tokenization followed by part-of-speech tagging. The output of the lexicon can be processed by a separate parser tool, which analyzes an Arabic sentence to determine whether it follows a valid grammatical structure. An evaluation was conducted to assess the effectiveness and efficiency of the new lexicon design using randomly selected real sentences. The results showed a minimum accuracy rate of 92%, which is considered highly satisfactory. The newly designed lexicon can be used by any application that requires Arabic language analysis and processing.
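
The two-step process named in the abstract (tokenization, then part-of-speech tagging) might be sketched roughly as follows. This is only an illustration and not the paper's implementation: the toy word list, the tag names, and the functions `tokenize`, `tag`, and `analyze` are all assumptions made for the example, and a real lexicon would also need to handle diacritics, clitics, and morphological attributes.

```python
import re

# Hypothetical toy lexicon (word -> part-of-speech tag); not taken from the paper.
TOY_LEXICON = {
    "كتب": "VERB",      # wrote
    "الولد": "NOUN",    # the boy
    "الدرس": "NOUN",    # the lesson
    "في": "PREP",       # in
    "المدرسة": "NOUN",  # the school
}


def tokenize(sentence):
    """Step 1: split a sentence into word tokens, dropping punctuation.

    In Python 3, \\w matches Unicode letters, including Arabic letters.
    Diacritics (combining marks) are not matched, so this sketch assumes
    undiacritized input.
    """
    return re.findall(r"\w+", sentence)


def tag(tokens):
    """Step 2: look each token up in the lexicon; unknown words get 'UNK'."""
    return [(token, TOY_LEXICON.get(token, "UNK")) for token in tokens]


def analyze(sentence):
    """Run both steps and return a list of (token, tag) pairs."""
    return tag(tokenize(sentence))


if __name__ == "__main__":
    for token, pos in analyze("كتب الولد الدرس في المدرسة."):
        print(token, pos)
```

The list of (token, tag) pairs is the kind of output a downstream parser could consume to check whether a sentence follows a valid grammatical structure.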

ISBN

9780982148938

Publisher

International Business Information Management Association, IBIMA

Volume

3

First Page

2086

Last Page

2096

Disciplines

Computer Sciences

Keywords

Arabic language, Lexicon, Natural language processing, Parser

Scopus ID

84905091526

Indexed in Scopus

yes

Open Access

no
