ANTi-Vax: A Novel Twitter Dataset for COVID-19 Vaccine Misinformation Detection

Document Type

Article

Source of Publication

Public Health

Publication Date

12-7-2021

Abstract

Objectives: The COVID-19 pandemic, caused by SARS-CoV-2, has infected hundreds of millions of people and caused millions of deaths around the globe. The introduction of COVID-19 vaccines provided a glimmer of hope and a pathway to recovery. However, misinformation spread on social media and other platforms has led to a rise in vaccine hesitancy, which can reduce vaccine uptake in the population. The goal of this research is to introduce a novel machine learning-based framework for detecting COVID-19 vaccine misinformation.

Study Design: We collected and annotated COVID-19 vaccine tweets and trained machine learning algorithms to classify vaccine misinformation.

Methods: More than 15,000 tweets were annotated as either misinformation or general vaccine tweets using reliable sources, and the annotations were validated by medical experts. The classification models explored were XGBoost, LSTM, and the BERT transformer model.

Results: BERT achieved the best classification performance, with an F1-score of 0.98 on the test set. Its precision and recall were 0.97 and 0.98, respectively.

Conclusion: Machine learning-based models are effective in detecting misinformation regarding COVID-19 vaccines on social media platforms.
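
The abstract reports precision, recall, and F1-score for a binary task (misinformation versus general vaccine tweets). The following is a minimal sketch of how those three metrics relate, using only the Python standard library. The function names and the toy labels are hypothetical and are not taken from the article or its dataset. Label 1 is assumed to mean "misinformation".

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = f1_from_scores(precision, recall)
    return precision, recall, f1


def f1_from_scores(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels (1 = misinformation, 0 = general vaccine tweet).
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]
p, r, f1 = precision_recall_f1(y_true, y_pred)

# Harmonic mean of the rounded precision and recall quoted in the abstract.
reported_f1 = f1_from_scores(0.97, 0.98)
```

Applied to the rounded values quoted in the abstract (0.97 and 0.98), the harmonic mean is roughly 0.975. Because those inputs are themselves rounded, this is compatible with the reported F1-score of 0.98 but does not reproduce it exactly.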

Publisher

Elsevier

Disciplines

Communication | Computer Sciences | Medicine and Health Sciences

Keywords

COVID-19, Vaccines, Text classification, Misinformation detection, Deep learning, Natural language processing

Scopus ID

85122516994

Indexed in Scopus

yes

Open Access

yes

Open Access Type

Bronze: This publication is openly available on the publisher’s website but without an open license
