Source of Publication
With time, numerous online communication platforms have emerged that allow people to express themselves, increasing the dissemination of toxic language, such as racism, sexual harassment, and other negative behaviors that are not accepted in polite society. As a result, toxic language identification in online communication has emerged as a critical application of natural language processing. Numerous academic and industrial researchers have recently studied toxic language identification using machine learning algorithms. However, several machine learning models assign unrealistically high toxicity ratings to nontoxic comments that contain particular identity descriptors, such as Muslim, Jewish, White, and Black. This research analyzes and compares modern deep learning algorithms for multilabel toxic comment classification. We explore two scenarios: the first is multilabel classification of religious toxic comments, and the second is multilabel classification of race- or ethnicity-based toxic comments, each with various pre-trained word embeddings (GloVe, Word2vec, and FastText) and without pre-trained embeddings, using an ordinary embedding layer. We compared the performance of these modern deep learning models in terms of multilabel evaluation metrics. Experiments show that the CNN model produced the best results for classifying multilabel toxic comments in both scenarios.
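The pipeline the abstract describes (embedding layer, convolution over token embeddings, and independent per-label sigmoid outputs for multilabel prediction) can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the vocabulary size, embedding dimension, filter count, six-label output, and random weights are all hypothetical stand-ins (a pre-trained GloVe, Word2vec, or FastText matrix would replace the random `embeddings` table in practice).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the paper's actual hyperparameters are not given here.
VOCAB, EMB_DIM, NUM_LABELS = 100, 16, 6      # e.g. six toxicity labels
KERNEL_WIDTH, FILTERS = 3, 8

# Embedding table: random stand-in for pre-trained GloVe/Word2vec/FastText rows.
embeddings = rng.normal(size=(VOCAB, EMB_DIM))

# Convolution filters and output layer (untrained, for illustration only).
conv_w = rng.normal(size=(FILTERS, KERNEL_WIDTH * EMB_DIM))
out_w = rng.normal(size=(NUM_LABELS, FILTERS))
out_b = np.zeros(NUM_LABELS)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def predict(token_ids, threshold=0.5):
    """Forward pass: embed -> 1D conv -> ReLU -> global max pool -> sigmoids.

    Unlike softmax multiclass output, each label gets its own sigmoid, so a
    comment can be flagged with several toxicity labels at once.
    """
    x = embeddings[token_ids]                        # (seq_len, EMB_DIM)
    windows = np.stack([x[i:i + KERNEL_WIDTH].ravel()
                        for i in range(len(token_ids) - KERNEL_WIDTH + 1)])
    feats = np.maximum(windows @ conv_w.T, 0.0)      # (n_windows, FILTERS)
    pooled = feats.max(axis=0)                       # global max pooling
    probs = sigmoid(out_w @ pooled + out_b)          # one probability per label
    return probs, (probs >= threshold).astype(int)

probs, labels = predict(rng.integers(0, VOCAB, size=20))
print(probs.shape, labels.shape)  # (6,) (6,)
```

Thresholding each sigmoid independently is what makes the setup multilabel: evaluation then uses multilabel metrics (e.g. Hamming loss or micro-averaged F1) rather than plain accuracy.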
Springer Science and Business Media LLC
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.
Abbasi, Ahmed; Javed, Abdul Rehman; Iqbal, Farkhund; Kryvinska, Natalia; and Jalil, Zunera, "Deep learning for religious and continent-based toxic content detection and classification" (2022). All Works. 5415.
Indexed in Scopus
Open Access Type
Gold: This publication is openly available in an open access journal/series