A deep fast learning framework towards exploring Imbalanced data and Multi-class Drift in Evolving Data Streams


K. Amrita Priya, et al.

Abstract

Data stream classification poses great challenges for the text-based data mining community in handling evolving data streams. Identifying feature evolution and class imbalance is an important research area for data stream classification with traditional machine learning classifiers. Class evolution and drift is the phenomenon of class emergence and disappearance. Because of class evolution, the performance of a learning model degrades drastically over time. The class evolution problem has been handled by analysing feature drift and multi-class drift. Multi-class drift occurs according to probability and time and is categorized as sudden, gradual, or recurring drift. This paper proposes a new framework, named the “Deep Fast Learning Framework”, to capture multi-class drift. First, features are extracted using an ensemble of techniques: Incremental Kernel Principal Component Analysis, Incremental Linear Discriminant Analysis, and Incremental Linear Principal Component Analysis. These techniques are treated as an online feature extraction process. The extracted features are processed by the deep fast learning classifier framework, a hybrid ensemble that runs chunk-based and online ensemble classifiers in parallel, based on gradual class evolution over blocks of data in the stream, represented as features. The base learner is a deep neural network that generates the fast learning model through deep analysis of the extracted features and their relationship with existing classes; the learner is continuously updated by replacing the older model with a newly trained model. Further, the base learner removes the least-utilized emerging classes and readily detects recurring classes on the basis of the extracted features.
This model is effective in identifying novel classes and recurring classes for features that are subject to multi-class drift. Finally, the class imbalance problem is handled by applying an undersampling method to the base learning model. Experimental results on a benchmark dataset demonstrate the superiority of the proposed framework over state-of-the-art approaches on performance measures such as precision, recall, and F-measure.
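
The abstract does not specify which undersampling method is used, so the following is only a minimal sketch of one common choice: random undersampling applied to a single chunk of the stream before it is passed to a base learner. The function name `undersample_chunk`, the `seed` parameter, and the toy chunk are hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


def undersample_chunk(features, labels, seed=0):
    """Randomly undersample every class in one chunk to the rarest class size.

    Assumption: plain random undersampling; the paper's exact method is not stated.
    """
    if not labels:
        return [], []
    rng = random.Random(seed)
    # Group the positions of the samples by class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    # Every class is reduced to the size of the smallest class in the chunk.
    target = min(len(indices) for indices in by_class.values())
    kept = []
    for indices in by_class.values():
        kept.extend(rng.sample(indices, target))
    kept.sort()  # preserve arrival order within the chunk
    return [features[i] for i in kept], [labels[i] for i in kept]


# Toy chunk (hypothetical data): class "a" is the majority, class "b" the minority.
chunk_x = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6]]
chunk_y = ["a", "a", "a", "a", "b", "b"]
balanced_x, balanced_y = undersample_chunk(chunk_x, chunk_y)
```

In this sketch each class in the chunk ends up with as many samples as the rarest class, so a base learner trained on the chunk sees a balanced class distribution at the cost of discarding some majority-class samples.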
