Myocardial Infarction Prediction Using Hybrid Machine Learning Techniques

Predicting myocardial infarction is an important task in health care today, and predicting cardiovascular disease is a critical challenge in clinical data analysis. It is difficult for physicians to predict myocardial infarction from large volumes of health records. To overcome this difficulty, we implement an automatic heart disease prediction system that can notify the patient early and support recovery from the disease. The system uses machine learning techniques to perform the prediction. Machine learning techniques can be divided into several types, including unsupervised and supervised learning. Supervised learning techniques work with structured, labeled data and are therefore well suited to this task. In this system we use the supervised classifiers KNN, RF, NN, DT, NB, and SVM. To predict myocardial infarction, the system uses a training dataset obtained from the UCI ML repository. The system also compares the accuracy of the machine learning algorithms and presents the accuracy results graphically. This makes it possible to assess the risk of the disease at an early stage and to treat the patient before serious harm occurs.


Introduction
Day by day, the volume of health records in the healthcare industry is growing. It is therefore necessary to manage this data and turn it into useful information for decision making. For this reason, the healthcare industry wants automatic techniques that deliver productive decisions from huge datasets. ML techniques are effective at resolving these kinds of issues, because they provide efficient methods for retrieving meaningful information without manual analysis of the whole database. In the medical industry, important data can be gathered from patients' manifestations and clinical reports for analysis by physicians. Nowadays people develop heart failure symptoms at every stage of life, although senior citizens are affected more often than young people. Machine learning techniques can find correlations between different features in a training dataset in order to predict myocardial infarction status. With training models of this kind, a system can detect myocardial infarction patients without the help of medical practitioners. It can then act as an automatic system that accurately separates positive myocardial infarction patients from negative ones, which reduces diagnosis time and the cost of treatment.
In the health care domain, providing quality services and predicting the diagnosis accurately is a major challenge. According to surveys, many people die of myocardial infarction even though the disease can be managed and controlled effectively. Any disease can be controlled only if it is detected at the right time. The proposed system therefore predicts heart disease status in advance, so that patients can be notified and helped to recover from the disease. Medical experts generate huge numbers of medical records, which must be analyzed to retrieve useful information from the database. The health care database contains mostly unstructured information, which makes the prediction of myocardial infarction a tedious task. Machine learning techniques, which work on structured information, can be applied in the health care domain to predict various diseases. This work therefore proposes an automatic system that lets physicians predict myocardial infarction in advance, so that they can treat the patient and prevent the condition from becoming severe. Supervised machine learning classifiers thus play an important role in predicting myocardial infarction early enough to diagnose and treat patients.

Related Work
Detecting or predicting myocardial infarction is one of the toughest tasks in the health care domain. Physicians can suspect myocardial infarction from risk factors such as smoking, high fat intake, and alcohol consumption, but a diagnosis based on these factors alone may or may not be accurate. As a result, doctors cannot always treat patients at an early stage, and patients may then face harmful outcomes. To detect or predict myocardial infarction, we therefore need a tool that detects the disease at an early stage, so that physicians can treat patients and prevent harmful consequences. Such a prediction tool can be implemented with supervised machine learning techniques. Many clinics and hospitals generate huge numbers of medical records in an unstructured format. The prediction tool can extract the useful information from these records to build a training dataset for further reference. The machine learning algorithms take this training dataset as input and predict the myocardial infarction status for the current patient's details, which form the testing dataset.
In this system we propose myocardial infarction prediction with six machine learning classification algorithms and compare the accuracy results of these six algorithms. The main task of the system is to predict the condition accurately when a patient experiences pain associated with heart disease. For the implementation, we build the trained model from the heart disease training dataset and the machine learning classifiers. Physicians can then enter the input values taken from a patient's health records and pass them to the trained model to predict myocardial infarction. The heart disease training dataset was downloaded from the UCI ML repository, to which it was uploaded by medical experts. We chose the Python language to build this system because it offers ready-made libraries and packages that provide the machine learning classifiers. In our experiments, all six algorithms predicted myocardial infarction with good accuracy.

Implementation
System Model: Figure.1 System Architecture.
Figure 1 depicts the proposed system model. The model uses the heart disease training dataset downloaded from the UCI repository. In preprocessing, the system reads the training dataset, and feature extraction then separates the independent and dependent attributes. Next, the system builds the training model with a classification algorithm and predicts myocardial infarction for the given input data. Finally, it calculates the accuracy of the six machine learning classifiers for comparison.
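The flow described above (load, split, train, predict, compare accuracy) can be sketched end to end. This is only an illustrative sketch under stated assumptions: the data is synthetic (generated with make_classification, not the heart disease dataset), the 80/20 train/test split, the min-max scaling, and all hyperparameters are assumptions, and the printed scores do not reproduce the accuracies reported in this paper.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): 303 records with 13 input features.
X, y = make_classification(n_samples=303, n_features=13, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# MultinomialNB requires non-negative inputs, so scale to [0, 1].
# The scaler is fitted on the training part only; test values are clipped.
scaler = MinMaxScaler().fit(x_train)
x_train = scaler.transform(x_train)
x_test = np.clip(scaler.transform(x_test), 0.0, 1.0)

# The six classifiers named in the paper; hyperparameters are assumptions.
classifiers = {
    "KNN": KNeighborsClassifier(n_neighbors=1),
    "RF": RandomForestClassifier(random_state=0),
    "NN": MLPClassifier(max_iter=1000, random_state=0),
    "DT": DecisionTreeClassifier(random_state=0),
    "NB": MultinomialNB(),
    "SVM": SVC(),
}

# Train each model and record its accuracy on the held-out records.
scores = {}
for name, clf in classifiers.items():
    clf.fit(x_train, y_train)
    scores[name] = accuracy_score(y_test, clf.predict(x_test))

for name, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name}: {score:.2f}")
```

Holding out a test set is one common way to compare classifiers; the scores dictionary could also feed a bar chart for the graphical comparison.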

Dataset Collection
In this system we use the UCI heart disease dataset shown in Figure 2, which we accessed from the Kaggle web repository (https://www.kaggle.com/ronitf/heart-disease-uci?select=heart.csv). This training dataset contains 14 attributes, or features, which are defined in Table 1. It contains 303 records, of which 164 belong to the NEGATIVE class and 139 belong to the POSITIVE class.

Preprocessing:
In preprocessing, we first load the training dataset with the pandas library. After importing pandas, we call the read_csv() method to read the entire dataset and store it in a variable. The snippet below shows the dataset loading process.
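The authors' snippet is not reproduced in this text. As a minimal sketch of the loading step, the example below reads CSV content with read_csv(). The in-memory sample, its column names, and its three rows are illustrative assumptions standing in for heart.csv, and load_dataset is a hypothetical helper name.

```python
import io

import pandas as pd

# Stand-in for heart.csv (assumption): the column names and the three
# rows below are illustrative values, not records from the real dataset.
SAMPLE_CSV = """age,sex,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal,target
54,1,0,130,250,0,1,150,0,1.2,1,0,2,1
47,0,2,120,210,0,0,165,0,0.0,2,0,2,0
61,1,0,145,280,1,0,130,1,2.5,1,2,3,1
"""


def load_dataset(source):
    """Read the training dataset from a file path or file-like object."""
    return pd.read_csv(source)


data = load_dataset(io.StringIO(SAMPLE_CSV))
print(data.shape)
print(data.head())
```

With the real file, the call would take a path instead, for example load_dataset("heart.csv"), assuming the file sits in the working directory.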

Feature Extraction:
After preprocessing (loading the training dataset), we extract the features of the dataset. Feature extraction separates the input and output attributes, which are stored in the x_train and y_train variables respectively. The snippet below shows the syntax.
In the snippet, the drop() function removes the target (class) column so that only the input attribute values remain, and these values are converted to array format with the numpy module. The output values are retrieved from the target column of the data frame.
The numpy module must be imported for the array conversion.
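A minimal sketch of this separation, assuming the class column is named target and using a small made-up data frame in place of the loaded dataset:

```python
import numpy as np
import pandas as pd

# Made-up records (assumption); "target" is the assumed class column name.
data = pd.DataFrame(
    {
        "age": [52, 61, 45, 58],
        "chol": [212, 260, 198, 240],
        "target": [0, 1, 0, 1],
    }
)

# Input attributes: every column except the target, as a numpy array.
x_train = np.array(data.drop(columns=["target"]))

# Output attribute: only the target column.
y_train = np.array(data["target"])

print(x_train.shape, y_train.shape)
```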

DT:
The DT classifier is a supervised machine learning technique. A decision tree has one root node and multiple internal nodes, and it ends in leaf nodes. The DT classifier arranges the training dataset in this tree form. When the user later enters testing data for myocardial infarction prediction, the classifier follows IF-THEN rules: it compares the input with the condition at each node until it reaches a leaf node. The leaf nodes hold the target column values, POSITIVE or NEGATIVE. The value of the leaf node reached by the testing record is the system's predicted value. In this system, we import DecisionTreeClassifier from the sklearn.tree package to build the training model for heart disease prediction. The snippet below shows how the DT classifier model is built.
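The authors' snippet is not reproduced in this text. The sketch below builds a DecisionTreeClassifier on a few made-up records and prints the learned IF-THEN rules with export_text. The two feature names, the data values, and the coding of 1 as POSITIVE and 0 as NEGATIVE are assumptions for illustration.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Made-up records (assumption): [age, cholesterol]; 1 = POSITIVE, 0 = NEGATIVE.
feature_names = ["age", "chol"]
x_train = [[45, 180], [50, 190], [48, 200], [62, 260], [67, 280], [64, 270]]
y_train = [0, 0, 0, 1, 1, 1]

tree = DecisionTreeClassifier(random_state=0)
tree.fit(x_train, y_train)

# Print the tree as nested IF-THEN rules ending in class leaves.
print(export_text(tree, feature_names=feature_names))

# A new record is routed from the root to a leaf; the leaf gives the class.
prediction = tree.predict([[65, 275]])[0]
print(prediction)
```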

RF:
The RF classifier is a collection of decision trees and is also a supervised machine learning algorithm. It builds a number of decision trees on random samples of the data and combines them to make a decision. During prediction, it collects the outputs of all the decision trees, and the class that receives more votes than the other becomes the system's predicted status. In this system the RF classifier achieves the best accuracy, 98%, among the algorithms compared. We import RandomForestClassifier from the sklearn.ensemble package to build the training model for heart disease prediction. The syntax below shows how the model is built.
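The authors' snippet is not reproduced in this text. As a hedged sketch, the example below trains a RandomForestClassifier on made-up records and inspects the individual trees. The data, the choice of 25 trees, and the coding of 1 as POSITIVE are assumptions. Note that scikit-learn combines the trees by averaging their predicted class probabilities rather than by counting hard votes; on clearly separated data both give the same result.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Made-up records (assumption): [age, cholesterol]; 1 = POSITIVE, 0 = NEGATIVE.
x_train = np.array(
    [[40 + i % 10, 180 + i] for i in range(20)]
    + [[60 + i % 10, 250 + i] for i in range(20)]
)
y_train = np.array([0] * 20 + [1] * 20)

# 25 trees is an arbitrary choice for this sketch.
forest = RandomForestClassifier(n_estimators=25, random_state=0)
forest.fit(x_train, y_train)

query = np.array([[66, 265]])

# Output of each individual tree for the query record.
votes = [int(t.predict(query)[0]) for t in forest.estimators_]
print(f"{sum(votes)} of {len(votes)} trees predict class 1")

# Combined decision of the forest.
prediction = forest.predict(query)[0]
print(prediction)
```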

NN:
The neural network classifier is the most advanced of the classifiers used here, because it imitates the way neurons in the brain work. The NN classifier has three kinds of layers: an input layer, hidden layers, and an output layer. The input layer receives the feature values from the dataset and passes them to the hidden layers, which perform the feature classification, and the results are passed on to the output layer. The classifier returns the class with the highest score in the output layer as the prediction. For this classifier we import MLPClassifier from the sklearn.neural_network package for myocardial infarction prediction, as in the snippet below:
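The authors' snippet is not reproduced in this text. A minimal sketch with MLPClassifier follows. The made-up data, the single hidden layer of 8 units, the lbfgs solver, and the feature standardization step are all assumptions chosen to keep a small example stable, not settings taken from the paper.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Made-up records (assumption): [age, cholesterol]; 1 = POSITIVE, 0 = NEGATIVE.
x_train = np.array(
    [[40 + i % 10, 180 + i] for i in range(20)]
    + [[60 + i % 10, 250 + i] for i in range(20)]
)
y_train = np.array([0] * 20 + [1] * 20)

# Input layer (2 features) -> one hidden layer (8 units) -> output layer.
model = make_pipeline(
    StandardScaler(),
    MLPClassifier(
        hidden_layer_sizes=(8,), solver="lbfgs", max_iter=2000, random_state=0
    ),
)
model.fit(x_train, y_train)

query = np.array([[66, 265]])
probabilities = model.predict_proba(query)  # one score per class
prediction = model.predict(query)[0]        # class with the highest score
print(probabilities, prediction)

# Weight matrices: input->hidden and hidden->output.
mlp = model.named_steps["mlpclassifier"]
print([w.shape for w in mlp.coefs_])
```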

NB:
In this predictive model, the naive Bayes algorithm is used to predict heart disease. The algorithm applies Bayes' rule to the prediction. It is a fast and simple classifier that computes the posterior probability of an event given other events, and it is widely used for text classification. The MultinomialNB classifier is imported from the sklearn.naive_bayes package. The snippet below shows how the classifier is used.
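The authors' snippet is not reproduced in this text. The sketch below fits MultinomialNB on made-up records and reads the posterior class probabilities for one record. The three feature meanings and all values are assumptions. MultinomialNB expects non-negative feature values, so the illustrative features are small non-negative integers.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Made-up non-negative features (assumption):
# [chest pain type, number of major vessels, exercise-induced angina].
x_train = np.array(
    [[0, 0, 0], [1, 0, 0], [0, 1, 0], [3, 2, 1], [2, 3, 1], [3, 1, 1]]
)
y_train = np.array([0, 0, 0, 1, 1, 1])  # 1 = POSITIVE, 0 = NEGATIVE

nb = MultinomialNB()
nb.fit(x_train, y_train)

query = np.array([[3, 2, 1]])

# Posterior probability of each class for the query, via Bayes' rule.
posterior = nb.predict_proba(query)
prediction = nb.predict(query)[0]
print(posterior, prediction)
```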

SVM:
The support vector machine is an important classifier because of its classification advantages. When classifying features, the SVM classifier draws margins between the classes and places a separating hyperplane between them. The hyperplane is defined by the support vectors, which are the data points of each class nearest to the hyperplane. In this system, the classifier separates POSITIVE and NEGATIVE records with a hyperplane, selects the nearest support vectors, and builds the training model to predict the status of the disease. The syntax for prediction is as follows.
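The authors' snippet is not reproduced in this text, and the paper does not name the class it used. As an assumption, the sketch below uses SVC from sklearn.svm with a linear kernel on made-up records, then lists the support vectors that define the hyperplane.

```python
import numpy as np
from sklearn.svm import SVC

# Made-up records (assumption): [age, cholesterol]; 1 = POSITIVE, 0 = NEGATIVE.
x_train = np.array(
    [[40 + i % 10, 180 + i] for i in range(20)]
    + [[60 + i % 10, 250 + i] for i in range(20)]
)
y_train = np.array([0] * 20 + [1] * 20)

# A linear kernel (assumption) gives a straight separating hyperplane.
svm = SVC(kernel="linear")
svm.fit(x_train, y_train)

# Training points closest to the hyperplane.
print(svm.support_vectors_)

query = np.array([[66, 265]])
prediction = svm.predict(query)[0]
print(prediction)
```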

KNN:
The K-nearest neighbor classifier differs from the other machine learning classifiers because it relies on the Euclidean distance formula. During prediction, it calculates the distance between the input record and each training record in turn and stores these distances. It then returns the class of the record with the smallest distance, and that class, POSITIVE or NEGATIVE, becomes the myocardial infarction prediction. We import the KNeighborsClassifier class from sklearn.neighbors. We set K to 1 so that the class of the single nearest record is returned as the output.
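A minimal sketch of this classifier with K set to 1, assuming made-up records and the coding of 1 as POSITIVE. The default metric of KNeighborsClassifier is the Euclidean distance; the example recomputes that distance by hand for the nearest record as a check.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Made-up records (assumption): [age, cholesterol]; 1 = POSITIVE, 0 = NEGATIVE.
x_train = np.array(
    [[40 + i % 10, 180 + i] for i in range(20)]
    + [[60 + i % 10, 250 + i] for i in range(20)]
)
y_train = np.array([0] * 20 + [1] * 20)

# K = 1: the class of the single nearest training record is returned.
knn = KNeighborsClassifier(n_neighbors=1)
knn.fit(x_train, y_train)

query = np.array([[66, 265]])
distances, indices = knn.kneighbors(query)

# Recompute the Euclidean distance to the nearest record by hand.
nearest = x_train[indices[0][0]]
manual_distance = np.sqrt(np.sum((nearest - query[0]) ** 2))

prediction = knn.predict(query)[0]
print(nearest, manual_distance, prediction)
```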

Prediction
This module runs after the training model has been built with the chosen classifier. To predict myocardial infarction, we call the predict() method with the testing dataset as input. This method is available in every classifier. When called, it compares the given testing data against what the classifier learned from the training dataset and returns the target value that best matches. The syntax below is used to predict myocardial infarction as POSITIVE or NEGATIVE.
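The authors' snippet is not reproduced in this text. The sketch below calls predict() on a fitted model for a single patient record and maps the numeric result to a label. The mapping of 1 to POSITIVE and 0 to NEGATIVE, the made-up data, and the helper name predict_status are assumptions; any of the six classifiers could be passed in the same way.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Assumed coding of the target column.
LABELS = {0: "NEGATIVE", 1: "POSITIVE"}

# Made-up records (assumption): [age, cholesterol].
x_train = np.array(
    [[40 + i % 10, 180 + i] for i in range(20)]
    + [[60 + i % 10, 250 + i] for i in range(20)]
)
y_train = np.array([0] * 20 + [1] * 20)

model = RandomForestClassifier(n_estimators=25, random_state=0)
model.fit(x_train, y_train)


def predict_status(fitted_model, record):
    """Return POSITIVE or NEGATIVE for one patient record (hypothetical helper)."""
    code = fitted_model.predict(np.array([record]))[0]
    return LABELS[int(code)]


print(predict_status(model, [66, 265]))
print(predict_status(model, [44, 190]))
```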

Conclusion
Thanks to the online world, a large amount of medical history data is now available. Extracting and analyzing this data has become necessary for predicting diseases. For myocardial infarction in particular, the rate of deaths due to heart attacks is increasing day by day.
This rate can be reduced by predicting the disease from an analysis of heart patients' medical history data. In this project we have proposed a comparative analysis of myocardial infarction prediction using popular classification algorithms. We classify the records and compare the results in terms of accuracy. We used KNN, SVM, NB, NN, DT, and Random Forest to classify the heart attack medical data and calculated the accuracy score of each. Among the algorithms used, the random forest algorithm achieved the highest accuracy, 98%.