Supervised Learning Techniques for Classification of Students’ Tweets
Abstract
Up-to-date information can now be retrieved from social networks, internet communities and data forums. People, especially the younger generation, share their feelings, happiness, experiences and day-to-day happenings on social media platforms such as Twitter, which hold a large volume of unstructured data. The proposed system uses engineering students' Twitter posts to study their learning process and the problems they face during their studies. Because the collected data is large, the Apache Hadoop MapReduce environment is used for processing. The system pre-processes the tweets, calculates the F1 measure, identifies prominent categories, computes word and category probabilities, and finally classifies each tweet into its category. Supervised learning techniques, namely multiclass SVM with Platt scaling, Naïve Bayes and logistic regression, are used to identify heavy study load, lack of social engagement and sleep problems. Comparing the results, SVM achieves an accuracy of 84%, which is 5 to 10 percent higher than logistic regression and Naïve Bayes.
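The word and category probability step can be illustrated with a minimal sketch of a multinomial Naïve Bayes classifier. This is not the authors' Hadoop MapReduce implementation: it runs on a single machine, the cleaning rules in `tokenize` are assumptions, the add-one (Laplace) smoothing is a common default rather than a detail taken from the article, and the tweets in `TOY_TWEETS` are invented for illustration. Only the three category names come from the abstract.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(tweet):
    """Lowercase a tweet, drop URLs and @mentions, keep hashtag words (assumed rules)."""
    tweet = re.sub(r"https?://\S+|@\w+", " ", tweet.lower())
    return re.findall(r"[a-z']+", tweet.replace("#", " "))


def train(labelled_tweets):
    """Count tweets per category and word occurrences per category."""
    category_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, category in labelled_tweets:
        category_counts[category] += 1
        for word in tokenize(text):
            word_counts[category][word] += 1
            vocab.add(word)
    return category_counts, word_counts, vocab


def classify(tweet, model):
    """Return the category with the highest log posterior score."""
    category_counts, word_counts, vocab = model
    total_tweets = sum(category_counts.values())
    scores = {}
    for category, n in category_counts.items():
        # Category probability: share of training tweets in this category.
        log_p = math.log(n / total_tweets)
        # Word probability with add-one smoothing over the vocabulary.
        denom = sum(word_counts[category].values()) + len(vocab)
        for word in tokenize(tweet):
            if word in vocab:  # words never seen in training are ignored
                log_p += math.log((word_counts[category][word] + 1) / denom)
        scores[category] = log_p
    return max(scores, key=scores.get)


# Invented toy data; the real system is trained on collected student tweets.
TOY_TWEETS = [
    ("so much homework and three labs due tomorrow", "heavy study load"),
    ("exam week again, too many assignments", "heavy study load"),
    ("no time to see friends this semester", "lack of social engagement"),
    ("missed the party again, stuck in the library alone", "lack of social engagement"),
    ("up all night, cannot sleep before the exam", "sleep problems"),
    ("only three hours of sleep, so tired", "sleep problems"),
]

model = train(TOY_TWEETS)
print(classify("I got no sleep last night, so tired #engineering", model))
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied, and the smoothing keeps a single unseen word from driving a category's score to zero.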
Article Details
Licensing
TURCOMAT publishes articles under the Creative Commons Attribution 4.0 International License (CC BY 4.0). This licensing allows for any use of the work, provided the original author(s) and source are credited, thereby facilitating the free exchange and use of research for the advancement of knowledge.
Detailed Licensing Terms
Attribution (BY): Users must give appropriate credit, provide a link to the license, and indicate if changes were made. Users may do so in any reasonable manner, but not in any way that suggests the licensor endorses them or their use.
No Additional Restrictions: Users may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.