Fake News Detection in Low-Resourced Languages “Kurdish Language” Using Machine Learning Algorithms


Rania Azad, Bilal Mohammed, Rawaz Mahmud, Lanya Zrara, Shajwan Sdiq


With the growing use of the internet and the large volume of real-time information created and shared on social
media platforms, the risk of malicious activity, illegal conduct, abuse of other people, and the spread of fake news
has increased dramatically. Fake news detection, which covers the nature of fake news and its detection or prevention,
is a well-studied research problem for highly resourced languages such as English and other European languages, as well
as Arabic. Less-resourced languages remain out of focus because of the absence of labeled fake news corpora, the absence
of fact-checking websites, or the unavailability of NLP tools. To date, no research has been conducted on fake news
detection in the Kurdish language. This paper presents a novel, publicly available Kurdish fake news corpus. It contains
two sets of news: the first contains crawled fake news, and the second contains text manipulated from real news. Several
classifiers were then applied to the corpus, using TF-IDF for feature selection. The results show that Support Vector
Machine (SVM) achieved the highest accuracy, 88.71%, among the classifiers on set 1, while Logistic Regression (LR)
outperformed the other algorithms on set 2. This work can serve as a baseline for future studies.
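
The pipeline summarized above (TF-IDF features fed to a classifier) can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the whitespace tokenizer, the smoothed IDF formula, the hand-rolled logistic regression trained by stochastic gradient descent, and the four toy English documents are all assumptions made for illustration, and none of the data comes from the Kurdish corpus.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: plain whitespace tokenization; a real system would need
    # language-specific preprocessing for Kurdish.
    return text.lower().split()

def fit_idf(docs):
    """Smoothed inverse document frequency for each term in the training docs."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}

def tfidf_vector(doc, idf):
    """L2-normalised TF-IDF weights; terms unseen during fitting are ignored."""
    tf = Counter(t for t in tokenize(doc) if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
    return {t: w / norm for t, w in vec.items()}

def score(vec, weights, bias):
    # Linear score of a sparse vector under the current model.
    return bias + sum(weights.get(t, 0.0) * w for t, w in vec.items())

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(vectors, labels, epochs=200, lr=0.5):
    """Binary logistic regression fitted with stochastic gradient descent."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for vec, y in zip(vectors, labels):
            err = y - sigmoid(score(vec, weights, bias))
            for t, w in vec.items():
                weights[t] = weights.get(t, 0.0) + lr * err * w
            bias += lr * err
    return weights, bias

def predict(doc, idf, weights, bias):
    """Return 1 (fake) or 0 (real) for a raw text document."""
    return 1 if sigmoid(score(tfidf_vector(doc, idf), weights, bias)) >= 0.5 else 0

# Hypothetical toy data: label 1 = fake, label 0 = real.
train_docs = [
    "shocking miracle cure doctors hate",
    "celebrity secret shocking scandal revealed",
    "council approves new school budget",
    "ministry publishes annual budget report",
]
train_labels = [1, 1, 0, 0]

idf = fit_idf(train_docs)
weights, bias = train_logreg([tfidf_vector(d, idf) for d in train_docs], train_labels)
```

Calling `predict("shocking secret cure revealed", idf, weights, bias)` then classifies an unseen headline. In practice a library implementation of TF-IDF and of the SVM and LR classifiers would replace these hand-written functions, and accuracy would be measured on a held-out split of the corpus.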
