The Internet has become an important resource for humanity, and information security in particular is a persistent concern in today's world. Hence, a more potent Intrusion Detection System (IDS) should be built. Machine Learning (ML) techniques are used to develop proficient models for IDS. Learning from imbalanced data is a critical challenge in many classification tasks. Resampling the training data towards a more balanced distribution is an effective way to combat this issue, and the most prevalent resampling techniques are undersampling and oversampling. In this paper, the issues of imbalanced data distribution and high dimensionality are addressed using a novel oversampling technique and an innovative feature selection method, respectively. Our work proposes a novel hybrid algorithm, HOK-SMOTE, which uses an ordered weighted averaging (OWA) approach to choose the best features from the KDD Cup 99 data set and K-Means SMOTE for imbalanced learning. An ensemble model is compared against the hybrid algorithm. This ensemble integrates Support Vector Machine (SVM), K-Nearest Neighbor (KNN), Gaussian Naïve Bayes (GNB), and Decision Tree (DT) classifiers, and weighted average voting is then applied to predict the outputs. Extensive experimentation was conducted on various oversampling techniques and traditional classifiers. The results indicate that the proposed approach is more accurate than the other ML techniques evaluated. The precision, recall, F-measure, and ROC curve show notable outcomes. K-Means SMOTE combined with ensemble learning therefore gives satisfactory results and an accurate solution to imbalanced learning in IDS. The study also examines whether ensemble modeling or oversampling contributes more on the intrusion data set.
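Two of the building blocks named above, ordered weighted averaging and weighted average voting, can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not state the OWA weights, how per-feature scores are obtained, or the classifier weights, so the function names and every number below are hypothetical and chosen only to show the arithmetic. The sketch assumes each feature receives several relevance scores that OWA fuses into one value, and that each classifier outputs class probabilities that are combined by a weighted mean.

```python
from typing import Dict, List, Sequence


def owa(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Ordered weighted average.

    The weights are applied to the scores after sorting them in
    descending order, so a weight is tied to a rank, not to a source.
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("OWA weights must sum to 1")
    ordered = sorted(scores, reverse=True)
    return sum(w * s for w, s in zip(weights, ordered))


def weighted_average_proba(
    probas: Dict[str, List[float]], weights: Dict[str, float]
) -> List[float]:
    """Weighted mean of per-classifier class probabilities."""
    total = sum(weights[name] for name in probas)
    if total <= 0:
        raise ValueError("classifier weights must sum to a positive value")
    n_classes = len(next(iter(probas.values())))
    return [
        sum(weights[name] * p[c] for name, p in probas.items()) / total
        for c in range(n_classes)
    ]


def weighted_average_vote(
    probas: Dict[str, List[float]], weights: Dict[str, float]
) -> int:
    """Index of the class with the highest weighted mean probability."""
    averaged = weighted_average_proba(probas, weights)
    return max(range(len(averaged)), key=averaged.__getitem__)


if __name__ == "__main__":
    # Hypothetical relevance scores for one feature from three scoring
    # methods, fused with hypothetical OWA weights.
    fused = owa([0.9, 0.4, 0.7], [0.5, 0.3, 0.2])
    print(f"fused feature score: {fused:.2f}")

    # Hypothetical [normal, attack] probabilities for one connection record.
    probas = {
        "SVM": [0.60, 0.40],
        "KNN": [0.30, 0.70],
        "GNB": [0.45, 0.55],
        "DT": [0.20, 0.80],
    }
    # Hypothetical classifier weights (not taken from the paper).
    weights = {"SVM": 0.3, "KNN": 0.2, "GNB": 0.1, "DT": 0.4}
    print("predicted class index:", weighted_average_vote(probas, weights))
```

In this sketch, features would be ranked by their fused OWA score and the top-ranked ones kept, and the voting step lets a more trusted classifier outweigh the others. For real experiments, library implementations such as scikit-learn's `VotingClassifier` (soft voting with `weights`) and imbalanced-learn's `KMeansSMOTE` would be the more natural starting point.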