Language Model-Based Approach for Multiclass Cyberbullying Detection

Author First name, Last name, Institution

Sanaa Kaddoura, Zayed University
Reem Nassar, Zayed University

Document Type

Conference Proceeding

Source of Publication

Web Information Systems Engineering (WISE 2024)

Publication Date

12-3-2024

Abstract

Cyberbullying, characterized by digital abuse such as harassment and doxing, has become prevalent on social media platforms, mainly targeting despised groups. Victims often endure severe psychological effects, including anxiety and strained interpersonal relationships, sometimes ending in tragic outcomes such as suicide. To mitigate these issues, automated systems for detecting cyberbullying text are crucial. While recent methods have employed classical, deep learning, and transformer-based language models such as BERT, there remains a gap in the literature regarding the comparative effectiveness of large language models in this domain. This study addresses this gap by evaluating the efficacy of large language models, specifically Mistral 7B and Llama3, against the transformer-based model BERT. The comparison covers binary and multiclass classification scenarios, assessing each model's performance in identifying cyberbullying content. The multiclass BERT model outperformed the large language models and other benchmark models from the literature, achieving an F1 score of 83.67%. The BERT model classified multiple classes effectively without showing bias.

ISBN

978-981-96-0566-8, 978-981-96-0567-5

ISSN

0302-3349

Publisher

Springer Nature Singapore

Volume

15437

First Page

78

Last Page

89

Disciplines

Computer Sciences

Indexed in Scopus

no

Open Access

no
