Enhancing Sentiment Analysis of Movie Reviews with PySpark

Document Type

Conference Proceeding

Source of Publication

2024 2nd International Conference on Self Sustainable Artificial Intelligence Systems (ICSSAS)

Publication Date

10-25-2024

Abstract

Sentiment analysis is pivotal in the film industry: it provides insight into audience opinion, supports movie recommendation, and helps forecast box office performance. This research uses PySpark, the Python API for Apache Spark, to enhance sentiment analysis of movie reviews. Traditional sentiment analysis techniques often struggle with the volume and complexity of movie-review text. Spark’s distributed computing capabilities enable efficient processing and analysis of large datasets. We use logistic regression for classification and validate the model with standard metrics. Our method shows considerable improvements in processing speed and scalability, offering insight into public sentiment and its potential effect on box office outcomes.

ISBN

979-8-3503-6841-3

Publisher

IEEE

Volume

00

First Page

1255

Last Page

1260

Disciplines

Computer Sciences

Keywords

Sentiment analysis, Movie reviews, PySpark, Logistic regression, Box office performance

Indexed in Scopus

no

Open Access

no
