Evaluating Automatic Annotation Techniques for Fine-Tuning Large Language Models in Financial Sentiment Analysis

Document Type

Conference Proceeding

Source of Publication

4th International Conference on Sentiment Analysis and Deep Learning, ICSADL 2025 - Proceedings

Publication Date

1-1-2025

Abstract

Accurate sentiment analysis in financial contexts is crucial for market analysis and investment decisions. However, manually annotating large datasets for training models is time-consuming and costly. This study evaluates two automatic annotation tools, VADER and TextBlob, for their effectiveness in generating labels for training large language models (LLMs) such as BERT and GPT-2 in financial sentiment analysis. We used a large dataset of stock market-related tweets, including a manually annotated subset for benchmarking. The remaining tweets were automatically labeled using VADER and TextBlob. We trained BERT and GPT-2 models with these sentiment labels and compared their performance against the benchmark dataset. Our findings show that models trained with VADER annotations agreed more closely with human-labeled data (62% accuracy) than those trained with TextBlob annotations (48% accuracy). These results suggest that VADER is the more suitable of the two tools for automatic annotation in financial sentiment analysis: it provides more accurate and reliable sentiment labels, which can significantly improve the efficiency of training large language models for financial applications.
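
The labeling and benchmarking steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how polarity scores were turned into labels, so the ±0.05 cutoff (the convention commonly used with VADER's compound score) is an assumption, and the function names, scores, and human labels below are hypothetical. In practice the scores would come from a tool such as VADER or TextBlob, both of which return a polarity value in [-1, 1]. Note that the sketch measures agreement between automatic and human labels directly, whereas the study reports the accuracy of models trained on the automatic labels.

```python
def score_to_label(score, threshold=0.05):
    """Map a polarity score in [-1, 1] to a three-way sentiment label.

    The threshold is an assumed cutoff, not a value taken from the paper.
    """
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


def accuracy(predicted, reference):
    """Return the fraction of positions where the two label lists agree."""
    if len(predicted) != len(reference):
        raise ValueError("label lists must have the same length")
    if not reference:
        raise ValueError("label lists must not be empty")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


# Hypothetical polarity scores for four tweets, standing in for tool output.
scores = [0.62, -0.40, 0.01, 0.30]
# Hypothetical human annotations for the same four tweets.
human_labels = ["positive", "negative", "neutral", "neutral"]

auto_labels = [score_to_label(s) for s in scores]
agreement = accuracy(auto_labels, human_labels)
print(auto_labels)
print(f"agreement with human labels: {agreement:.2f}")
```

Running the same comparison once per annotation tool on the manually annotated subset gives a quick check of label quality before any model is fine-tuned.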

ISBN

9798331523923

Publisher

IEEE

First Page

215

Last Page

220

Disciplines

Computer Sciences

Keywords

BERT, Financial domain, GPT-2, Sentiment Analysis, VADER and TextBlob

Scopus ID

05002445890

Indexed in Scopus

yes

Open Access

no
