Semi Supervised Multi Text Classifications for Telugu Documents

D Naga Sudha, et al.

doi:10.17762/turcomat.v12i12.7430

Published: May 23, 2021

Abstract

As the amount of information available on the internet grows rapidly, text classification becomes critical. Much of this data is unstructured and must be digitized. Once documents are in digital form, they need to be organized by automatically assigning them to predefined labels based on their content. To meet the growing demand for news text classification, keyword detection approaches based mostly on supervised classification methods have been proposed. In practice, however, the available data is largely unlabeled, which calls for semi-supervised learning techniques instead. In this paper, we examine the effectiveness of a semi-supervised method for Telugu news articles. The paper also addresses some of the most pressing issues in automated text classification: dealing with unstructured text, handling large numbers of attributes using natural language processing techniques, and dealing with missing metadata due to Telugu's morphological complexity. After classification, semi-supervised clustering is used to classify patterns. Labeled texts are used to capture the silhouette coefficients of text clusters, while unlabeled texts are used to adapt the centroids. The aim of this study is to use semi-supervised learning methods to investigate the effect of n-gram feature selection on news article text classification. The results for news articles are validated using classification rate, precision, recall, and F-score. They show that the approaches significantly improve performance.
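The idea of adapting cluster centroids with unlabeled texts over n-gram features can be illustrated with a minimal sketch. This is not the authors' implementation: the character trigram features, cosine similarity, seeded-centroid update, function names, and the toy English strings (standing in for Telugu news text) are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams of a text (assumed feature choice)."""
    padded = f" {text.strip()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def normalize(vec):
    """Scale a sparse vector to unit length so dot products are cosines."""
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {k: v / norm for k, v in vec.items()} if norm else dict(vec)


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors."""
    return sum(v * b.get(k, 0.0) for k, v in a.items())


def mean_vector(vectors):
    """Unit-length centroid of a list of sparse vectors."""
    total = {}
    for vec in vectors:
        for k, v in vec.items():
            total[k] = total.get(k, 0.0) + v
    return normalize(total)


def seeded_centroid_classify(labeled, unlabeled, n=3, iterations=5):
    """Seed one centroid per class from labeled texts, then let the
    unlabeled texts adapt the centroids over a few iterations."""
    lab_vecs = [(normalize(char_ngrams(t, n)), y) for t, y in labeled]
    unl_vecs = [normalize(char_ngrams(t, n)) for t in unlabeled]
    labels = sorted({y for _, y in lab_vecs})
    centroids = {
        y: mean_vector([v for v, lab in lab_vecs if lab == y]) for y in labels
    }
    assigned = []
    for _ in range(iterations):
        # Assign each unlabeled text to its most similar centroid.
        assigned = [
            max(labels, key=lambda y: cosine(v, centroids[y])) for v in unl_vecs
        ]
        # Recompute centroids from labeled seeds plus current assignments.
        for y in labels:
            members = [v for v, lab in lab_vecs if lab == y]
            members += [v for v, a in zip(unl_vecs, assigned) if a == y]
            centroids[y] = mean_vector(members)
    return assigned


# Hypothetical toy data; real input would be Telugu news articles.
labeled_docs = [
    ("cricket match score team wins", "sports"),
    ("election vote minister parliament", "politics"),
]
unlabeled_docs = [
    "the team wins the cricket match",
    "minister wins the election vote",
]
predicted = seeded_centroid_classify(labeled_docs, unlabeled_docs)
print(predicted)
```

Character n-grams are used here because they need no tokenizer or stemmer, which can be convenient for a morphologically rich language, but word n-grams would fit the same structure by swapping out `char_ngrams`.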

How to Cite

D Naga Sudha, et al. (2021). Semi Supervised Multi Text Classifications for Telugu Documents. Turkish Journal of Computer and Mathematics Education (TURCOMAT), 12(12), 644–648. https://doi.org/10.17762/turcomat.v12i12.7430

Issue

Vol. 12 No. 12 (2021)

Section

Research Articles

You are free to:

Share — copy and redistribute the material in any medium or format for any purpose, even commercially.
Adapt — remix, transform, and build upon the material for any purpose, even commercially.
The licensor cannot revoke these freedoms as long as you follow the license terms.

Under the following terms:

Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.

Notices:

You do not have to comply with the license for elements of the material in the public domain or where your use is permitted by an applicable exception or limitation.

No warranties are given. The license may not give you all of the permissions necessary for your intended use. For example, other rights such as publicity, privacy, or moral rights may limit how you use the material.
