Hybrid Voting Classifier Model for COVID-19 Prediction by Embedding Machine Learning Techniques

Gurjot Kour, Pawanesh Abrol, Namrata Kalrupia, Jasmeen Kaur


Predictive analytics, including data mining methods, is typically used to improve the predictability of outcomes of interest or key performance indicators (KPIs). This work addresses COVID-19 prediction in three phases: data pre-processing, feature reduction, and classification. It presents a Hybrid Voting Classifier for predicting coronavirus infection. The dataset, which records COVID-19 cases in Mexico, is collected from an authentic data source and pre-processed to remove missing and redundant values. Feature reduction is then performed with the PCA algorithm, and the k-means algorithm is applied to cluster similar features together and separate dissimilar ones. In the last phase, a voting classifier that combines naive Bayes, a Random Forest classifier, Bernoulli naive Bayes, and SVM is applied for COVID-19 prediction. The proposed model is evaluated in terms of accuracy, precision, and recall. The results show that logistic regression gives an accuracy of 84%, naive Bayes 82%, and the voting classifier the highest accuracy, 94%. Recall is 84% for logistic regression, 82% for naive Bayes, and 94% for the voting classifier. Precision is 71% for logistic regression, 73% for naive Bayes, and 94% for the voting classifier. The study also shows how these supervised approaches may help to alleviate the enormous strain on the constrained capacity of the healthcare system.
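
To make the voting step and the three evaluation measures concrete, the following is a minimal sketch in plain Python. It assumes hard (majority) voting and binary labels; the abstract does not state whether hard or soft voting is used. The function names `majority_vote` and `binary_metrics`, the three stand-in models, and all label values are hypothetical and invented for illustration. They are not the authors' implementation, models, or data.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Hard voting: for each sample, return the label predicted by most models."""
    voted = []
    # zip(*...) regroups the per-model lists into one tuple of votes per sample
    for sample_votes in zip(*predictions_per_model):
        voted.append(Counter(sample_votes).most_common(1)[0][0])
    return voted


def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Toy data (assumption): 1 = infected, 0 = not infected
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
# Predictions of three hypothetical base classifiers, each wrong on two samples
model_a = [1, 0, 0, 1, 0, 1, 1, 0]
model_b = [1, 1, 1, 1, 0, 0, 0, 0]
model_c = [0, 0, 1, 1, 0, 0, 1, 1]

voted = majority_vote([model_a, model_b, model_c])

print("model_a:", binary_metrics(y_true, model_a))
print("voting: ", binary_metrics(y_true, voted))
```

In this toy case each base model errs on different samples, so the majority label is correct everywhere even though every individual model scores 75% accuracy. Real base classifiers often make correlated errors, so the gain from voting is usually smaller than this contrived example suggests.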
