All Works

ZAEBUC design and annotation: Guidelines, processes, and insights

Nizar Habash, NYU Abu Dhabi
David M. Palfreyman, Zayed University

Document Type

Book Chapter

Source of Publication

Bilingual Writers and Corpus Analysis

Publication Date

12-23-2022

Abstract

In this chapter, we present the ZAEBUC corpus annotations used by the remaining chapters in this book. In addition to rich metadata for all the texts in ZAEBUC, we discuss the various guidelines and pipeline processes we followed to create the annotations and quality-check them. The annotations include spelling and grammar correction, morphological tokenization, part-of-speech tagging, lemmatization, and Common European Framework of Reference (CEFR) ratings. All of the annotations are done on both Arabic and English texts, using consistent guidelines as much as possible. We also tracked the alignments among the different annotations and with the original raw texts. For all annotations, we use existing automatic annotation tools followed by manual correction, except for CEFR ratings, which are only manual. We also present various measurements and correlations with preliminary insights drawn from the data and annotations. The ZAEBUC corpus annotations are intended to be the stepping stones for additional annotations. Some of the book chapters use the annotations directly, and some extend them through additional manual and automatic annotations.

DOI Link

10.4324/9781003183921-2

ISBN

9781000782660,9781003183921

Publisher

Routledge

Disciplines

Education | Linguistics

Scopus ID

85143670981

Recommended Citation

Habash, Nizar and Palfreyman, David M., "ZAEBUC design and annotation: Guidelines, processes, and insights" (2022). All Works. 5591.
https://zuscholars.zu.ac.ae/works/5591

Indexed in Scopus

yes
