Impact of K-Means and DBSCAN Clustering on Supervised Learning for Heart Disease Prediction


Pulugu Dileep, et al.


Cardiovascular diseases (CVDs) are the leading cause of death worldwide, claiming around 17.9 million lives every year. Many different heart ailments and conditions can be fatal, and early detection of heart disease can help reduce the death rate. Data-driven approaches based on Artificial Intelligence (AI) are increasingly used in Clinical Decision Support Systems (CDSS). The literature shows that most existing algorithms rely on supervised learning, which needs a training set for the learning process. Supervised learning often reveals limitations because it depends wholly on the quality of the training data. Feature selection has long been used to improve supervised machine learning techniques. However, there is inadequate research on unsupervised learning methods such as clustering in this setting. Clustering methods group objects by similarity and thus capture inherent knowledge that can add value to the extracted features. Given this, we propose a framework known as Hybrid Machine Learning for Heart Disease Prediction (HML-HDP). The framework applies unsupervised learning followed by supervised learning. In other words, the unsupervised step yields features that complement the feature selection method and so improve the accuracy of heart disease prediction. A prototype application was built to evaluate the framework, and the empirical results were compared with state-of-the-art methods. SVM with DBSCAN showed the highest performance, with precision 0.96, recall 0.93 and F-measure 0.9447. The results show that the proposed hybrid framework performs better than existing methods.
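
The idea of clustering first and classifying second can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes scikit-learn, synthetic stand-in data from make_classification rather than a real heart disease dataset, three clusters, and a hypothetical helper named add_cluster_features. K-Means is used here because scikit-learn's DBSCAN has no predict method, so a DBSCAN variant would need an extra step to assign unseen samples to clusters.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.metrics import precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def add_cluster_features(kmeans, X):
    """Hypothetical helper: append one-hot cluster membership to X."""
    labels = kmeans.predict(X)
    one_hot = np.eye(kmeans.n_clusters)[labels]
    return np.hstack([X, one_hot])


# Synthetic stand-in data (assumption): 300 samples, 13 features, binary target.
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale using training statistics only, so no test information leaks in.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Unsupervised step: learn clusters on the training data without labels.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_train_s)

# Hybrid features: original features plus cluster membership.
X_train_h = add_cluster_features(kmeans, X_train_s)
X_test_h = add_cluster_features(kmeans, X_test_s)

# Supervised step: train an SVM on the augmented features.
clf = SVC(kernel="rbf").fit(X_train_h, y_train)
y_pred = clf.predict(X_test_h)

precision, recall, f1, _ = precision_recall_fscore_support(
    y_test, y_pred, average="binary"
)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

The scores printed by this sketch come from synthetic data and say nothing about the results reported above. The number of clusters and the SVM kernel are illustrative choices only.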
