AFND: Arabic fake news dataset for the detection and classification of articles credibility
Source of Publication
Data in Brief
The task of detecting news credibility has gained attention recently due to the rapid growth of news on social media platforms. This article provides a large, labeled, and diverse Arabic Fake News Dataset (AFND) collected from public Arabic news websites. The dataset enables the research community to apply supervised and unsupervised machine learning algorithms to classify the credibility of Arabic news articles. AFND consists of 606,912 public news articles that were scraped from 134 public news websites in 19 Arab countries over a 6-month period using Python scripts. Each public news source was manually classified as credible, not credible, or undecided using the Arabic fact-checking platform Misbar. Weak supervision was then applied: every article receives the same label as the source that published it. AFND is imbalanced in the number of articles per class, so it is also useful for researchers working on solutions for imbalanced datasets. The dataset is available in JSON format and can be accessed from the Mendeley Data repository.
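The weak-supervision step described in the abstract, where each article inherits the label of its source, can be illustrated with a minimal Python sketch. The source names, article records, and field names below (`source`, `title`, `label`) are hypothetical placeholders chosen for illustration; they are not the actual AFND JSON schema or the authors' scripts.

```python
from collections import Counter

# Hypothetical source-level labels, standing in for the manual,
# Misbar-based classification of each news website.
source_labels = {
    "source_a": "credible",
    "source_b": "not credible",
    "source_c": "undecided",
}

# Hypothetical article records; field names are illustrative only.
articles = [
    {"source": "source_a", "title": "t1"},
    {"source": "source_a", "title": "t2"},
    {"source": "source_b", "title": "t3"},
    {"source": "source_c", "title": "t4"},
]


def weak_label(articles, source_labels):
    """Return copies of the articles, each tagged with its source's label."""
    return [{**a, "label": source_labels[a["source"]]} for a in articles]


labeled = weak_label(articles, source_labels)

# Counting articles per class makes any class imbalance visible.
class_counts = Counter(a["label"] for a in labeled)
print(class_counts)
```

Because labels are assigned per source rather than per article, the per-class counts depend on how many articles each source published, which is one way an imbalance between classes can arise.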
Communication | Computer Sciences
Arabic news dataset, Arabic fake news, Article credibility, Weak labeling, Detection, Classification
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.
Khalil, Ashwaq; Jarrah, Moath; Aldwairi, Monther; and Jaradat, Manar, "AFND: Arabic fake news dataset for the detection and classification of articles credibility" (2022). All Works. 4999.
Indexed in Scopus
Open Access Type
Gold: This publication is openly available in an open access journal/series