Habberrih, Abdullah and Ali Abuzaraida, Mustafa (2024) Sentiment Analysis of Libyan Dialect Using Machine Learning with Stemming and Stop-words Removal. In: 5TH INTERNATIONAL CONFERENCE ON COMMUNICATION ENGINEERING AND COMPUTER SCIENCE (CIC-COCOS'24), 24-25/04/2024, Cihan University-Erbil.
Conf_COCOS24_06-07-2024...pdf - Published Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.
Abstract
This study evaluates the impact of stemming and stop-word removal on two machine learning classifiers, Support Vector Machine (SVM) and Logistic Regression (LR), in detecting sentiment in Libyan dialect poetry. The scarcity of Arabic Natural Language Processing resources has made sentiment analysis for Arabic more challenging than for other languages. A secondary dataset was used in two experiments: the first applied stemming together with stop-word removal, and the second applied stemming alone. Other preprocessing techniques were also applied, and features were extracted using TF-IDF over a combination of unigrams and trigrams. The results show that stop-word removal may have a negative impact on classifier performance. SVM outperformed LR in both experiments, achieving an accuracy of 71.63%, while LR achieved 70.92% in the second experiment. The best accuracy of 71.63% exceeds the 69% reported in earlier studies on the topic.
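The feature-extraction setup described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. The stop-word list, the `light_stem` stand-in stemmer, the order of the preprocessing steps, the smoothed IDF formula, and the two example sentences are all assumptions made for illustration. The sketch also reads "a combination of unigrams and trigrams" literally, so bigrams are left out. The `remove_stop_words` flag switches between the two experimental settings (stemming with stop-word removal, and stemming alone).

```python
import math
from collections import Counter

# Hypothetical toy stop-word list; the list used in the paper is not given here.
STOP_WORDS = {"في", "من", "على"}


def light_stem(token):
    """Stand-in stemmer (assumption): strip the definite article 'ال' from longer tokens."""
    if token.startswith("ال") and len(token) > 4:
        return token[2:]
    return token


def preprocess(text, remove_stop_words):
    """Whitespace-tokenize, optionally drop stop words, then stem every token."""
    tokens = text.split()
    if remove_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    return [light_stem(t) for t in tokens]


def ngram_features(tokens):
    """Return unigrams followed by trigrams (no bigrams)."""
    unigrams = list(tokens)
    trigrams = [" ".join(tokens[i:i + 3]) for i in range(len(tokens) - 2)]
    return unigrams + trigrams


def tfidf(docs, remove_stop_words):
    """Compute one sparse TF-IDF dict per document, using a smoothed IDF (assumption)."""
    feats = [Counter(ngram_features(preprocess(d, remove_stop_words))) for d in docs]
    n_docs = len(docs)
    # Document frequency: number of documents in which each term appears.
    df = Counter(term for f in feats for term in f)
    vectors = []
    for f in feats:
        total = sum(f.values())
        vectors.append({
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in f.items()
        })
    return vectors


# Two made-up example lines, not taken from the dataset used in the study.
docs = ["القلب في الغربة حزين", "القلب من الفرح يغني"]
with_stop_word_removal = tfidf(docs, remove_stop_words=True)   # first experiment's setting
stemming_only = tfidf(docs, remove_stop_words=False)           # second experiment's setting
```

Each resulting dictionary maps an n-gram to its weight and would still need to be converted into a fixed-length vector before being passed to an SVM or LR classifier. Note that removing stop words before building trigrams changes which trigrams exist, since words that were separated by a stop word become adjacent.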
| Item Type: | Conference or Workshop Item (Paper) |
|---|---|
| Uncontrolled Keywords: | Sentiment Analysis, Arabic Dialects, Machine Learning, TF-IDF, Stemming. |
| Subjects: | Q Science > QA Mathematics > QA76 Computer software |
| Divisions: | Conferences > CIC-COCOS |
| Depositing User: | ePrints Depositor |
| Date Deposited: | 13 Apr 2025 18:57 |
| Last Modified: | 13 Apr 2025 18:57 |
| URI: | https://eprints.cihanuniversity.edu.iq/id/eprint/3137 |
