American University of Bahrain

A Development of a Sentiment Analysis Model for the Bahraini Dialects

thesis
posted on 2024-02-19, 13:35 authored by Noor Khalifa

Sentiment analysis has become a widely researched topic with the exponential growth of data on the Web. Many industries, such as the financial, political, retail, and oil and gas industries, use sentiment analysis to extract information about people’s opinions of a service, product, person, event, or organization. This information is later used to predict behaviours, forecast data, improve services, or simply understand people’s opinions about a topic. Numerous Arabic sentiment analysis models have been developed for many dialects using previously collected or original datasets. However, these models have been shown to have low accuracy when predicting the sentiment of text written in the Bahraini dialects. This is due to the lack of Natural Language Processing (NLP) research dedicated to the Bahraini dialects and the lack of datasets that include Bahraini text. This project aims to create a labeled dataset of Bahraini text collected from public resources and to develop a sentiment analysis model specifically for the Bahraini dialects using the collected dataset. The dataset was created by scraping Instagram comments from various Bahraini accounts and was labeled by three volunteers. The models were developed using machine learning and deep learning algorithms. Many combinations of preprocessing techniques and features were investigated to find the most accurate model. The most accurate machine learning algorithm was Logistic Regression with TF-IDF, unigrams, and bigrams as features, reaching an average accuracy of 70.05%. The most accurate deep learning architecture consisted of two sequential LSTM layers and reached an accuracy of 64.74%. This first-of-its-kind project is a major contribution to NLP research in Bahrain.
The project’s findings could have a notable impact on Bahraini businesses’ ability to understand their customers when using the created models. Additionally, the outcome of this project can be used by other researchers as a benchmark or starting point for their research on NLP.
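
To make the best-performing classical setup in the abstract more concrete, the following is a minimal from-scratch sketch of Logistic Regression over TF-IDF weighted unigrams and bigrams. It is not the thesis implementation: the library, preprocessing steps, label scheme, and hyperparameters used in the project are not stated on this page, so everything below is an assumption made for illustration. The sample texts are invented English placeholders rather than Bahraini comments, the labels are assumed to be binary (1 = positive, 0 = negative), and the function names are hypothetical. The IDF formula uses the common add-one smoothing, ln((1 + n) / (1 + df)) + 1.

```python
import math
from collections import Counter


def ngrams(text, n_max=2):
    """Return whitespace tokens plus adjacent-token n-grams up to n_max."""
    tokens = text.split()
    feats = []
    for n in range(1, n_max + 1):
        feats += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return feats


def fit_tfidf(docs):
    """Learn a vocabulary and smoothed IDF weights from the training texts."""
    df = Counter()
    for doc in docs:
        df.update(set(ngrams(doc)))  # document frequency: count each term once per text
    vocab = {term: i for i, term in enumerate(sorted(df))}
    n = len(docs)
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}
    return vocab, idf


def transform(doc, vocab, idf):
    """Map one text to an L2-normalised sparse TF-IDF vector {index: weight}."""
    counts = Counter(t for t in ngrams(doc) if t in vocab)
    vec = {vocab[t]: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
    return {i: w / norm for i, w in vec.items()}


def train_logreg(X, y, dim, epochs=200, lr=0.5):
    """Fit binary logistic regression with plain stochastic gradient descent."""
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, label in zip(X, y):
            z = b + sum(w[i] * v for i, v in x.items())
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of class 1
            g = p - label                   # gradient of the log loss w.r.t. z
            for i, v in x.items():
                w[i] -= lr * g * v
            b -= lr * g
    return w, b


def predict(x, w, b):
    """Return 1 if the predicted probability is at least 0.5, else 0."""
    z = b + sum(w[i] * v for i, v in x.items())
    return 1 if z >= 0 else 0


# Invented placeholder data (assumption): English stand-ins for labeled comments.
train_texts = [
    "great service and friendly staff",
    "really great food",
    "terrible service and rude staff",
    "really terrible food",
]
train_labels = [1, 1, 0, 0]

vocab, idf = fit_tfidf(train_texts)
X_train = [transform(t, vocab, idf) for t in train_texts]
w, b = train_logreg(X_train, train_labels, dim=len(vocab))

for text in ["great staff", "terrible food"]:
    print(text, "->", predict(transform(text, vocab, idf), w, b))
```

Bigrams let the model weight short phrases such as "terrible food" separately from their individual words, which is one plausible reason a combined unigram and bigram feature set can outperform unigrams alone. A real experiment would also need a held-out test split and dialect-specific preprocessing, neither of which is shown here.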

Place of Publication

Manama, Bahrain

Qualification Name

Bachelor of Science in Computer Engineering

Qualification Level

  • Undergraduate

Supervisors

Dr. Hasan Kadhem

    Capstone Projects & reports-BS
