Main Article Content
This study proposes a two-step approach to Afaan Oromoo text sentiment classification: clustering followed by classification. The experiments use 1,597 records collected from the Oromia Broadcasting Corporate (OBN) “8331SMS database”, covering three domains: news, entertainment and general service. First, the raw text is preprocessed to clean it and prepare it for further processing. Next, a clustering algorithm is applied to find natural groupings among the unlabeled Afaan Oromoo opinion documents. K-means and Gaussian Mixture Model (GMM) clustering were tested; GMM performed better and was selected to group the documents. The clustering output is exported directly to a CSV file and prepared for the classification task. Three supervised learning algorithms, Naïve Bayes (NB), Support Vector Machine (SVM) and K-Nearest Neighbor (KNN), are then applied in each domain to classify the sentiment of short Afaan Oromoo texts. SVM outperforms NB and KNN, with accuracies of 91.66%, 93.76% and 92.87% for the news, entertainment and general service domains, respectively. These results are promising for the design of sentiment analysis systems for comments written in Afaan Oromoo.
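The two-step idea described above (cluster the unlabeled texts, then train a classifier on the resulting groups) can be illustrated with a minimal, dependency-free sketch. It rests on several assumptions that are not taken from the study: the cluster assignments are used directly as training labels, texts are represented as TF-IDF vectors, and the six toy documents are English placeholder tokens rather than records from the SMS corpus. To stay within the Python standard library, the sketch uses K-means and Naïve Bayes, two of the algorithms the study tested, instead of the best-performing GMM and SVM; with scikit-learn, `GaussianMixture` and `SVC` could take their place. The function names and the farthest-first initialisation are illustrative choices.

```python
import math
from collections import Counter


def tfidf(docs):
    """Turn whitespace-tokenised documents into L2-normalised TF-IDF vectors."""
    tokens = [d.lower().split() for d in docs]
    vocab = sorted({w for t in tokens for w in t})
    df = Counter(w for t in tokens for w in set(t))  # document frequency
    n = len(docs)
    vectors = []
    for t in tokens:
        tf = Counter(t)
        vec = [tf[w] * (math.log((1 + n) / (1 + df[w])) + 1) for w in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Plain K-means; farthest-first initialisation keeps the run deterministic."""
    centers = [vectors[0]]
    while len(centers) < k:
        centers.append(
            max(vectors, key=lambda v: min(sq_dist(v, c) for c in centers))
        )
    labels = []
    for _ in range(iterations):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(v, centers[j])) for v in vectors
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def train_nb(docs, labels):
    """Collect per-class word counts for a multinomial Naive Bayes model."""
    word_counts, vocab = {}, set()
    for doc, label in zip(docs, labels):
        words = doc.lower().split()
        word_counts.setdefault(label, Counter()).update(words)
        vocab.update(words)
    return word_counts, Counter(labels), vocab


def predict_nb(model, doc):
    """Return the most probable class under Laplace-smoothed Naive Bayes."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        score = math.log(class_counts[label] / total_docs)
        denom = sum(counts.values()) + len(vocab)
        for w in doc.lower().split():
            if w in vocab:  # words unseen in training are ignored
                score += math.log((counts[w] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Placeholder documents (hypothetical, not from the 8331SMS data).
docs = [
    "good nice program", "nice good news", "good good nice",
    "bad slow service", "slow bad signal", "bad bad slow",
]

labels = kmeans(tfidf(docs), k=2)   # step 1: cluster the unlabeled texts
model = train_nb(docs, labels)      # step 2: train on the cluster labels
print(labels)
print(predict_nb(model, "good nice"), predict_nb(model, "slow bad"))
```

In this sketch the cluster indices carry no meaning of their own; in practice each cluster would still need to be inspected and mapped to a sentiment class before the classifier's predictions can be read as positive or negative.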