AlexResNet+: A Deep Hybrid Featured Machine Learning Model for Breast Cancer Tissue Classification

The rapid rise in cancer, primarily breast cancer, has pushed academia and industry to pursue more efficient and reliable breast cancer tissue identification and classification. Unlike classical machine learning approaches, which focus merely on improving classification efficiency, this paper emphasizes extracting multiple deep features for breast cancer diagnosis. To this end, a deep hybrid featured machine learning model for breast cancer tissue classification, named AlexResNet+, was developed. We used two well-known and efficient deep learning models, AlexNet and shorted ResNet50, for deep feature extraction. To retain high-dimensional deep features while preserving computational efficiency, we applied AlexNet with five convolutional layers


I. Introduction
In the last few years, cancer has emerged as a major and deadly threat to humanity across the world. Recent Indian population registration data [1] indicated that almost 8 lakh patients die every year due to cancer, making it the second largest chronic disease claiming human life in India. The Indian Council of Medical Research (ICMR) studied cancer cases across India and found that almost 14 lakh cases were reported in 2016; the real number could be even larger, as many cases remain unreported to healthcare agencies. The ICMR study also revealed that until 2019 the rate of cancer diagnosis was 25.8 per lakh population, which can increase up to 35 by or before 2029. India, along with the USA and China, is among the top three countries with the highest number of cancer diagnoses globally. A recent report also reveals that Kerala, Tamil Nadu, and Delhi are the key states, where almost 2000 breast cancer cases are reported every day. Of these, almost 1200 cases are found at an advanced or later stage, which eventually reduces the survival rate significantly (4 to 17 times). In addition, late detection or diagnosis results in 1.5 to 2 times higher cost. Considering cancer types, breast cancer has been identified as the second largest cause of cancer deaths among women, after lung cancer. A recent study revealed that in 2015 almost 500,000 women died because of breast cancer [2]. The World Health Organization (WHO) also indicates that approximately 1.5 million women might die because of breast cancer [2] [3]. Even the USA, one of the most developed countries with advanced medical facilities, witnessed 252,710 breast cancer patients and 40,610 deaths due to breast cancer in 2017 [2].
Breast cancer can be defined as the multiplication of dead-cells or masses within or Amongst the major deep learning models, the convolutional neural network (CNN) and its variants have been used for varied medical image analysis purposes [20][21][22]. Deep learning approaches act as (deep) feature extractors to perform histopathological image classification towards a diagnostic decision. Deep learning methods like AlexNet [21] and ResNet [20] have performed better than most existing machine learning and deep learning approaches. Their efficiency is further improved because they do not depend on an additional feature extractor [23]. However, the classification efficiency of these approaches depends primarily on the features extracted and subsequently used [23]. Most deep learning models have been designed with significantly large datasets so as to enhance their learning ability; in practice, however, they require a pretrained model to perform accurate classification, which makes classification dependent on the pretrained features. In the real world, by contrast, a patient may have only a limited number of mammography images or biopsy histopathological tissue images. In such a case, applying a secondary pretrained model in conjunction with the patient's own (say, primary) set of (limited) images may give false positive or false negative results. This can make the overall system unreliable and can adversely affect the diagnostic decision. Moreover, merely applying very deep features can lead to overfitting and information redundancy, which eventually reduces overall performance. The literature reveals that extracting deep features and classifying them with an efficient machine learning model can yield better performance, especially with small data sizes, which is practical in real-world applications [24].
Research Article Vol.12 No.6 (2021), 2420-2438
Considering the above key inferences, this paper develops a highly robust deep hybrid featured machine learning model for breast cancer tissue classification (AlexResNet+). As the name indicates, the proposed model employs two well-known deep learning methods, AlexNet CNN and ResNet, as feature extractors to retrieve an optimal set of features for further classification. After retrieving the hybrid features, we performed two-class classification using an SVM with a radial basis function (RBF) kernel. With 10-fold cross-validation, the proposed AlexResNet+ model achieves an accuracy of 95.87%, precision of 0.9760, sensitivity of 1.0, specificity of 0.9621, F-measure of 0.9878 and AUC of 0.960. The proposed model was developed on the MATLAB 2019b platform, and simulation results with the DDSM dataset revealed that it outperforms major existing solutions for breast cancer tissue classification.
The remaining sections are organized as follows. Section II discusses key literature pertaining to breast cancer detection. Section III presents the research questions, while the problem formulation is discussed in Section IV. Section V presents the proposed system and its implementation, while the simulation results are given in Section VI. Section VII presents the overall research conclusion. References used in this manuscript are given at the end.

II. Related Work
Demir et al. [27] proposed a cellular-level diagnosis model for automatic breast cancer detection using biopsy images. Bergmeir et al. [28] first performed textural and GLCM feature extraction using local histograms. After extracting features, the authors applied a quasi-supervised learning concept to perform two-class classification, achieving a maximum accuracy of 88%. Similarly, Mouelhi et al. [29] exploited Haralick's texture features [30], histogram of oriented gradients (HOG), and color component based statistical moments (CCSM) features to diagnose microscopic biopsy images. Huang et al. [31] performed segmentation followed by textural feature extraction and classification using SVM, where the highest accuracy obtained was 92.8%. Landini et al. [32] performed morphologic characterization of cell neighborhoods in neoplastic and preneoplastic tissue of microscopic biopsy images. The authors used watershed transforms for cell and nuclei region segmentation. A k-NN classifier yielded 83% accuracy in classifying images into dysplastic and neoplastic classes. Sinha et al. [33] used key features like eccentricity, area ratio, compactness, average values of color components, energy entropy, correlation, and area of cells and nucleus. Classification using Bayesian, k-NN, ANN, and SVM classifiers showed accuracies of 82.3%, 70.60%, 94.1%, and 94.1%, respectively. Kasmin et al. [34] considered features like area, perimeter, convex area, solidity, major axis length, orientation, filled area, eccentricity, cell ratio, nucleus area, circularity, and mean intensity of cytoplasm to perform breast cancer diagnosis. k-NN and ANN classifiers yielded accuracies of 86% and 92%, respectively. George et al. [35] refined nuclei segmentation using the watershed algorithm, followed by textural and shape feature extraction. The authors used an ANN classifier to perform two-class classification. Filipczuk et al. [36] recommended ensemble learning for higher classification accuracy (98.51%) in breast cancer diagnosis. George et al. [35] trained their model with 92 images and achieved an accuracy of 97.15% using ANN. Brook et al. [37] and Zhang et al. [38] performed three-class classification (normal, in situ carcinoma and invasive carcinoma) over the data given in [39]. Brook et al. [37] dichotomized the histopathological images using different threshold values for further classification using SVM, which achieved an accuracy of 93.4%. Zhang et al. [38] used arbitrarily selected subsets of curvelet transform and local binary pattern (LBP) features, classified with SVM, for breast cancer identification. The authors found a highest accuracy of 97%. Carvalho et al. [40], on the other hand, proposed hybrid features for breast cancer classification. Kumari [41] used a nominal set of attributes which were later classified using a k-NN classifier to perform breast cancer identification. Tapak et al. [42]  [99] applied deep learning-based mass classification and localization over mammogram images to classify them as mass and non-mass, achieving a sensitivity of 85%. Jadoon et al. [100] applied CNN for three-class (normal, malignant, and benign) mammogram classification. The authors applied discrete wavelet (DW) and curvelet transform (CT) based CNNs for cancer detection and found that the inclusion of external features can achieve prediction accuracies of 81.83% (DW-CNN) and 83.74% (CT-CNN).

III. Research Questions
Considering the research directions discussed above for breast cancer detection, and the strengths and limitations of existing systems, we defined research questions that ask whether a novel solution can be derived by alleviating the problems at hand. The research questions are based on the identified candidate solution and its targeted significance. These research questions are:

RQ1: Can the use of hybrid deep features from different deep learning models like AlexNet-CNN and shorted ResNet enable more efficient and reliable breast cancer diagnosis?
RQ2: Can the use of AlexNet-CNN with a 5CONV-3FC architecture and shorted ResNet50 deep learning models generate a more efficient feature set for breast cancer?
RQ3: Can the amalgamation of AlexNet-CNN and shorted ResNet deep features in conjunction with an SVM-RBF classifier yield a better breast cancer diagnosis solution?
RQ4: Can the AlexResNet+ model be called a reliable and more efficient solution for image-based breast cancer diagnosis?

IV. Problem Formulation
This research first explored a significantly large body of literature pertaining to medical images (histopathological images, mammographic images, micro-biopsy images, etc.). This secondary-data assessment revealed that although a large number of studies have addressed breast cancer diagnosis, there are large differences in the performance reported by different researchers using the same machine learning methods. In other words, approaches using the same machine learning method in different studies exhibit different performance in terms of accuracy, specificity, sensitivity and AUC. This raises doubts about the presented results and their generalization. Additionally, most existing systems focus merely on implementing different machine learning algorithms, and very few efforts are made towards feature enrichment. Yet the efficiency of a medical image-based CAD solution depends significantly on the inherent features. In this respect, classifier-centric existing methods can be hypothesized to be limited. Similarly, machine learning approaches that focus merely on using a large set of images to train the model can undergo overfitting and information redundancy, which can limit performance. Considering deep learning, the secondary data revealed that existing deep models like AlexNet, ResNet, RCNN, VGGNet, DenseNet, etc. have been applied towards breast cancer diagnosis; these approaches primarily contribute better deep feature extraction and avoid classical mechanisms like pre-processing, ROI segmentation, feature extraction and selection. The features obtained by the aforesaid deep learning models, combined as an amalgamated feature set (say, fused features), can be of great significance. Such an amalgamated feature set can then be classified with any machine learning algorithm for two-class classification.
This mechanism not only reduces the computational overhead caused by pre-processing, nucleus segmentation and feature extraction, but also retains sufficiently large (deep) features for two-class classification into benign and malignant. Taking the above key facts and scope into consideration, this paper proposes a first-of-its-kind hybrid deep-feature learning assisted breast cancer tissue identification model for early diagnosis. More specifically, we apply two well-known and robust deep learning models, AlexNet and shorted ResNet50, as distinct feature extractors. The features extracted by each deep learning model are amalgamated to yield a composite feature set named AlexResNet+, which is subsequently processed for two-class classification using an SVM-RBF classifier. The prime intent behind the AlexResNet+ model was to extract and retain the most significant features for tissue pattern analysis and classification. Thus, an SVM-RBF classifier, which has been found efficient in much of the literature, was trained over the AlexResNet+ features to classify each mammographic image from DDSM as benign or malignant. This hybrid deep-feature learning concept, in conjunction with a machine learning model, can yield more efficient and reliable performance in breast cancer diagnosis.
The detailed discussion of the overall proposed model is given in the subsequent sections.

V. Proposed System
The overall proposed deep hybrid featured machine learning model for breast cancer tissue identification encompasses four key phases. These are: Phase-1 Data Collection and Augmentation, Phase-2 AlexResNet+ Feature Extraction, Phase-3 Feature Fusion, and Phase-4 SVM-RBF based two-class classification.
The detailed discussion of these sequential implementation measures is given in the subsequent sections.

A. Data Collection and Augmentation
To assess the performance of the proposed breast cancer tissue identification and classification model, we used a well-known standard benchmark dataset, the Digital Database for Screening Mammography (DDSM) [101]. This mammogram data was collected and standardized by the University of South Florida [102]. DDSM was collected to represent original breast data, prepared with an average dimension of 3000 × 4800 pixels, at a resolution of 42 microns with 16 bits. The DDSM database comprises a total of 2,620 scanned breast mammography images, organized into 43 distinct volumes. In this data, the benign and the malignant masses are identified and annotated by expert radiologists. A single sample from DDSM for both the benign and the malignant class is given in Table I. For data augmentation, we first transformed the mammography images using affine transformations so as to avoid introducing biases into classification or prediction through other morphological processes. Another approach to data augmentation is patching the mammography images. However, this approach has the effect of selecting sections or pieces of images that have a similar structure but belong to images of different classes. To convert all breast mass images into a common space and thereby enable better quantitative analysis, we normalized the amount of mass or stain information on the tissue as per [103]. Here, for each of the labelled images (i.e., benign and malignant) we performed random colour augmentations. In our proposed work, we down-sampled each original image to 1024 × 768 pixels. From the down-sampled images we extracted crops of 150 × 75 pixels. Observing that the obtained data is sufficient, each mammography image was represented by 20 crops, and the crops were encoded into 20 descriptors. Subsequently, the set of descriptors was combined by means of 3-norm pooling, which converts it into a single descriptor. Mathematically, we use (1) to retrieve the single descriptor.

$$ d_{pool} = \left( \frac{1}{N} \sum_{i=1}^{N} (d_i)^{p} \right)^{1/p} \tag{1} $$

We assigned the value of $p$ as 3 as per [104]. Here, $N$ states the total number of crops, while the descriptor of the $i$-th crop is indicated as $d_i$ and $d_{pool}$ refers to the pooled descriptor of each mammographic image. Noticeably, the p-norm of a vector gives the average value for $p = 1$ and the maximum value as $p \to \infty$. Consequently, each original mammography image yields a large number of descriptors that help build an optimal set of features for further classification. After data augmentation, we performed deep feature extraction using AlexNet CNN and shorted ResNet50 deep learning models. The detailed discussion is given in the subsequent sections.
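The 3-norm pooling of crop descriptors described above can be illustrated with a minimal sketch. The function name `p_norm_pool` and the toy two-dimensional descriptors are hypothetical, and the sketch assumes non-negative descriptor components (for example, activations taken after a ReLU); it is not the authors' MATLAB implementation.

```python
def p_norm_pool(descriptors, p=3.0):
    """Pool N equal-length crop descriptors into a single descriptor.

    Each output component is ((1/N) * sum_i d_i**p) ** (1/p).
    Assumes non-negative components (an assumption of this sketch).
    """
    if not descriptors:
        raise ValueError("need at least one descriptor")
    n = len(descriptors)
    dim = len(descriptors[0])
    if any(len(d) != dim for d in descriptors):
        raise ValueError("all descriptors must have the same length")
    pooled = []
    for k in range(dim):
        mean_pow = sum(d[k] ** p for d in descriptors) / n
        pooled.append(mean_pow ** (1.0 / p))
    return pooled


# Three toy crop descriptors (the paper uses 20 crops per image).
crops = [[1.0, 0.0], [2.0, 0.0], [3.0, 4.0]]
average = p_norm_pool(crops, p=1.0)   # p = 1 reduces to the plain average
pooled = p_norm_pool(crops, p=3.0)    # p = 3 as used in the text
print(average, pooled)
```

With p = 3 the pooled value sits between the average and the maximum of each component, so strongly activated crops weigh more than they would under plain averaging.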

B. AlexResNet+ Feature extraction
In this paper, we focused on amalgamating deep features obtained by AlexNet CNN and ResNet50 deep networks. Our prime intent is to extract and use deep (AlexNet with 4096 kernels) and diverse-depth features (shorted residual ResNet50) as a combined feature vector to perform more reliable and efficient breast cancer classification. The details of these deep learning models are given as follows:

a). AlexNet CNN
AlexNet is considered the first CNN model to exhibit better performance than the major existing deep learning models for object detection and classification. Although AlexNet CNN was designed to perform object classification in conjunction with a pretrained model, its robustness enables it to be used as a transfer learning model that can efficiently be applied to feature extraction from breast cancer mass images. Unlike a classical CNN, which retrieves 256-dimensional features, we retrieved 4096-dimensional features at the FC layers, which provide more depth information for better decisions. In our proposed AlexNet CNN design we employed five convolutional layers (i.e., CONV1, CONV2, CONV3, CONV4 and CONV5) and two FC layers (FC6 and FC7). Noticeably, amongst the three possible fully connected layers (FC6, FC7 and FC8), FC8 had 1024-dimensional features. In contrast, the FC6 and FC7 layers had 4096-dimensional features, which are higher-dimensional than FC8, and therefore we considered only FC6 and FC7 features for further classification. The classical design of AlexNet-CNN encompasses eight layers: five convolutional layers and three fully connected layers. The overall design of the proposed AlexNet CNN is given in Fig. 1. In our proposed model, the augmented images were fed directly as input to the first CONV layer of AlexNet, which has 96 kernels. Each CONV layer generated distinct features, which subsequently underwent feature scaling and mean subtraction. The outputs were then resized and fed to the subsequent layers.

• CONV
Convolutional layer or CONV is the amalgamation of two distinct filters (horizontal and vertical filters) capable of extracting and embedding feature patterns for the input images. The kernel specification of the CONV layers is: CONV1 with 96 kernels, CONV2 with 256 kernels, CONV3 with 384 kernels, CONV4 with 384 kernels and CONV5 with 256 kernels. In the problem at hand, mammographic image feature extraction, each neuron extracts a feature map whose units share the same set of weights (W) and bias (b). These shared values help the neurons in a feature map identify the same feature. Thus, CONV layers with different neurons (Fig. 1) use varied sets of bias and weight values to extract different local features. Here, the CONV layer filters the input (augmented) mammographic images and retrieves the final feature vector as output. We obtained consecutive features with different neurons, zero-padding of 2 and stride of 4. In the proposed design, the first layer of the deep network received input of size 224 × 224 and applied 96 kernels (of size 11 × 11 with a stride of 4 pixels). Here, the depth of the 96 kernels was equal to the total number of channels of the input image. Subsequently, after local response normalization and max-pooling, the output of the first layer was fed as input to the second layer. The second layer performed filtering with 256 kernels of size 5 × 5 × 96. The 3rd, 4th and 5th layers are connected to one another without any normalization layer. The third convolutional layer has 384 kernels of size 3 × 3 × 256, while the fourth layer has 384 kernels of size 3 × 3 × 384. After the five consecutive convolutional (CONV) layers, two fully connected (FC) layers with 4096 dimensions each were applied. Here, we maintained two FC layers as the problem at hand pertains to two-class classification.
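To make the layer arithmetic above concrete, the following sketch computes the spatial output size of the first CONV layer and the parameter counts implied by the kernel shapes listed in the text. The helper names are hypothetical, and three input channels are an assumption (the text mentions three-channel inputs elsewhere but does not state it for this layer).

```python
def conv_output_size(input_size, kernel_size, stride, padding):
    """Spatial output size of a convolution along one dimension."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


def conv_param_count(num_kernels, kernel_size, in_channels):
    """Weights plus one bias per kernel, for square kernels."""
    return num_kernels * (kernel_size * kernel_size * in_channels + 1)


# CONV1 as described: 224 x 224 input, 96 kernels of 11 x 11,
# stride 4, zero-padding 2. Three input channels are assumed.
conv1_size = conv_output_size(224, 11, stride=4, padding=2)
conv1_params = conv_param_count(96, 11, in_channels=3)

# CONV2 to CONV4 use the kernel shapes listed in the text.
conv2_params = conv_param_count(256, 5, in_channels=96)
conv3_params = conv_param_count(384, 3, in_channels=256)
conv4_params = conv_param_count(384, 3, in_channels=384)

print(conv1_size, conv1_params, conv2_params, conv3_params, conv4_params)
```

The integer division reflects the usual convention of discarding a partial window at the border; frameworks may differ in how they handle that remainder.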

• Max-Pooling Layers
In our proposed model, we applied the Max-Pooling layer as a feature selection layer that iteratively reduces the spatial resolution of each feature map obtained from the CONV process. Moreover, the pooling layer helps minimize the number of parameters and the computation. This is achieved by subsampling: each local region is reduced to a single value. It also helps avoid the over-fitting problem. We applied Max-pooling to retrieve translation-invariant representations of the input data. It down-samples the latent representation by a constant factor, taking the highest value over each non-overlapping sub-region. Max-pooling introduces sparsity over the hidden representation by eliminating all non-maximal values in each non-overlapping sub-region, and thereby improves the feature detectors by preventing insignificant values from being retained for further computation. In the same manner, for reconstruction the derived sparse latent code reduces the number of filters needed to decode each pixel. This makes our proposed model more computationally efficient. We used one Max-pooling layer after each CONV layer, where each layer is characterized by a 3×3 receptive field with a stride of 3.
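A minimal sketch of non-overlapping max-pooling with a 3×3 window and stride 3, as described above, is given here. The function name and the 6×6 toy feature map are hypothetical; a real implementation would operate on multi-channel tensors.

```python
def max_pool_2d(feature_map, window=3, stride=3):
    """Max-pool a 2-D feature map given as a list of equal-length rows."""
    rows = len(feature_map)
    cols = len(feature_map[0])
    pooled = []
    for r in range(0, rows - window + 1, stride):
        pooled_row = []
        for c in range(0, cols - window + 1, stride):
            # Keep only the largest value in each window.
            pooled_row.append(max(
                feature_map[r + i][c + j]
                for i in range(window)
                for j in range(window)
            ))
        pooled.append(pooled_row)
    return pooled


# Toy 6 x 6 feature map holding the values 0..35 in row-major order.
fmap = [[r * 6 + c for c in range(6)] for r in range(6)]
pooled = max_pool_2d(fmap)
print(pooled)
```

Because the stride equals the window size, every input value belongs to exactly one window, which matches the non-overlapping sub-regions the text refers to.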

• ReLU Layers
We applied a supplementary layer, the rectified linear unit (ReLU), which primarily acts as an activation function. The ReLU layer applies a non-linear element-wise function. In our proposed model, we applied three ReLU layers. With input y, ReLU returns the neuron output q(y) as y if y > 0 and (δ × y) if y <= 0. Here, δ determines whether negative components are scaled by a small slope (here, 0.01…) or fixed to 0. To make our proposed model behave as the native ReLU function q(y) = max(0, y), we performed activation at a zero threshold and hence assigned δ = 0.
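The activation rule above can be written as a short function. This is only an illustrative sketch: the name `relu` and the parameter name `delta` (standing for δ) are chosen here, with δ = 0 as the default to match the setting stated in the text.

```python
def relu(y, delta=0.0):
    """Return y for positive inputs and delta * y otherwise.

    delta = 0 gives the native ReLU, max(0, y); a small positive
    delta (for example 0.01) gives the leaky variant.
    """
    return y if y > 0 else delta * y


activations = [relu(v) for v in [-2.0, -0.5, 0.0, 1.5]]
leaky = [relu(v, delta=0.01) for v in [-2.0, 1.5]]
print(activations, leaky)
```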

• FC Layers
Here, the FC layers act as the last layer(s) of the AlexNet CNN and perform high-level reasoning. Although in the classical AlexNet model the FC layer acts as a classification layer, we use this layer merely to obtain the final feature vector for further classification using SVM-RBF. Functionally, this layer receives the set of neurons, also called the feature vectors, from the previous layers (i.e., CONV) and maps it to all connected neurons. Eventually it generates a one-dimensional feature vector to be used for further classification. Considering the high-dimensional features and their respective significance for classification, we applied the FC6 and FC7 layers, which provided 4096-dimensional features. Noticeably, we used the FC6 features as the single input feature for feature fusion. Thus, the FC6 output was the final one-dimensional feature vector obtained from AlexNet CNN.

b). ResNet50
ResNet is also called the residual network. The ResNet deep model comprises restructured CONV layers that learn residual functions with reference to their inputs. Unlike classical deep learning models, residual networks such as ResNet are easier to optimise and can retrieve more diverse and deeper information (say, features) [19]. In a typical implementation, a "residual block (RB)" attaches a "shortcut connection" to each CONV stack in such a way that it runs in parallel to the CONV layers and performs identity mapping. The output of the CONV layers is added to the output of the shortcut branch, and the result is propagated to the next block. Beside the use of the aforesaid shortcut connection, the network architecture of ResNet evolved from VGGNet. Here, each CONV has small kernels of size 3 × 3. For the design we follow these rules:
Rule-1: For the same size of output feature map, the layers should have the same number of filters.
Rule-2: When the size of the feature map is reduced by 50% (using CONV with stride 2), the total number of filters is doubled, which keeps it computationally efficient and less time-consuming per layer.
Rule-3: The ResNet architecture(s) can have different depths, varying between 34 and 152 layers.

In our proposed ResNet deep learning model, the breast mammographic images were fed as input, from which high-dimensional features were retrieved. We employed a pre-activation modified ResNet50, which is well known for its efficiency towards "training by learning" using residual functions. We designed the ResNet model with different activation functions, which enabled it to process multiple mammographic mass images together for concurrent feature extraction and learning. Here, ResNet50 retrieved different features from different angles of the augmented input images. The augmented images were resized so as to maintain 150 × 75 × 3 dimensions. After training over the extracted features for each breast mammography image, our proposed ResNet50 exhibited multi-layer feature extraction and subsequent learning. To retain better efficiency and swift convergence, we performed retraining of the inputs by adding a residual block given as (2).

$$ y = F(x, \{W_i\}) + x \tag{2} $$

In equation (2), $x$ and $y$ state the input and output vectors of the residual layer, correspondingly. Here, $F(x, \{W_i\})$ states the residual mapping amongst the input images to be learnt. The functional architecture of residual map learning is given in Fig. 2. As depicted, ResNet50 was applied with an added shortcut, which addresses the loss function without any additional parameters or computational overhead.

Figure 2. Identity block (residual branch F(x), identity shortcut x, output F(x) + x)
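The identity block of Fig. 2 adds the learned residual to the unchanged input. The sketch below shows only that addition; `toy_residual` is a hypothetical stand-in for the stacked CONV layers and is not the residual mapping used in the paper.

```python
def residual_block(x, residual_fn):
    """Return F(x) + x, where residual_fn plays the role of F."""
    fx = residual_fn(x)
    if len(fx) != len(x):
        raise ValueError("identity shortcut needs matching dimensions")
    return [f + xi for f, xi in zip(fx, x)]


def toy_residual(x):
    # Hypothetical F(x): scale by 0.5, then apply ReLU.
    return [max(0.0, 0.5 * v) for v in x]


y = residual_block([2.0, -4.0], toy_residual)
unchanged = residual_block([2.0, -4.0], lambda x: [0.0] * len(x))
print(y, unchanged)
```

When the residual branch outputs zeros the block reduces to the identity, which is one common explanation of why such blocks are easy to optimise. The shortcut itself adds no parameters.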
Noticeably, in our proposed ResNet model we performed batch normalization over the input images. In other words, we applied an additional layer called BatchNormalization as the preceding layer with three channels. It normalized each input channel (we used three channels) across a mini-batch. To enhance training efficiency and minimize the sensitivity to network initialization, we applied the BatchNormalization layer between CONV and the non-linearity (i.e., the ReLU layer). It normalized the activations of each channel by subtracting the mini-batch mean and dividing by the mini-batch standard deviation. Subsequently, it shifted the input by a learnable offset and scaled it by a learnable scale factor. Thus, our modified ResNet50 model was intended to retain significant features with low computation. Although classical deep learning models including AlexNet and ResNet50 apply a final dense layer with Softmax activation to perform classification, we merely obtained the final deep features for two-class classification using the SVM-RBF machine learning algorithm. The overall implementation schematic of the ResNet50 model used, with hidden units, is given in Fig. 3. As depicted there, we applied a "residual block" for each stacked layer, which eventually retrieves multi-layered features for further classification.
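The per-channel normalization described above can be sketched for a single channel as follows. The function name, the small constant `eps` added for numerical stability, and the default scale and offset values are assumptions of this sketch; in a trained network the scale and offset are learned.

```python
import math


def batch_norm(channel_values, scale=1.0, offset=0.0, eps=1e-5):
    """Normalize one channel's activations across a mini-batch.

    Subtracts the mini-batch mean, divides by the mini-batch standard
    deviation, then applies the scale factor and the offset.
    """
    n = len(channel_values)
    mean = sum(channel_values) / n
    variance = sum((v - mean) ** 2 for v in channel_values) / n
    inv_std = 1.0 / math.sqrt(variance + eps)
    return [scale * (v - mean) * inv_std + offset for v in channel_values]


normalized = batch_norm([1.0, 2.0, 3.0])
shifted = batch_norm([1.0, 2.0, 3.0], scale=2.0, offset=1.0)
print(normalized, shifted)
```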

D. SVM-based two-class classification
SVM is one of the most widely used machine learning methods for pattern classification. Its computational efficiency and robustness make it suitable for classification purposes including text classification, target detection, image processing, etc. Being a supervised learning concept, SVM learns over the input patterns and behaves as a non-probabilistic binary classifier. To classify the inputs, it reduces the generalization error over unobserved instances by means of structural risk minimization. Here, the support vectors represent a subset of the training set that defines the boundary, called the hyperplane, between two classes having distinct features or patterns. We applied (4) to perform two-class classification.

$$ Y' = w^{T} \varphi(x) + b \tag{4} $$

In (4), $\varphi(x)$ states the non-linear transform, where the emphasis is on retrieving a suitable weight $w$ and bias value $b$. In (4), $Y'$ is estimated by iteratively reducing the regression risk (5).

$$ R_{reg}(Y') = C \sum_{i=1}^{N} \gamma(Y'_i - Y_i) + \frac{1}{2} \lVert w \rVert^{2} \tag{5} $$

In (5), $C$ states a penalty factor, while $\gamma$ presents the cost function. The weight values are obtained using (6).

$$ w = \sum_{j=1}^{N} (\alpha_j - \alpha_j^{*}) \varphi(x_j) \tag{6} $$

In the above equation, the components $\alpha$ and $\alpha^{*}$ state relaxation factors, often called Lagrange multipliers, which are always non-negative. The final output of the SVM is (7).

$$ Y' = \sum_{j=1}^{N} (\alpha_j - \alpha_j^{*}) K(x_j, x) + b \tag{7} $$

In (7), $K(x_j, x)$ presents the kernel function. In general, there are three key kernel functions: linear, polynomial and radial basis function. In our proposed model, we applied SVM with different kernel functions, namely linear, polynomial and RBF. After training over the extracted features, we performed testing on random test cases or images. In this process the SVM classified each test input or breast mammographic image as benign or malignant. The overall simulation results and allied inferences are discussed in the subsequent sections.
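As an illustration of a kernel-based decision of the kind described above, the sketch below evaluates a weighted sum of RBF kernel values plus a bias and thresholds it at zero. The support vectors, coefficients and bias are hand-picked toy values, not trained ones, and `rbf_gamma` is the RBF width parameter of this sketch, unrelated to the cost function γ in the text.

```python
import math


def rbf_kernel(u, v, rbf_gamma=0.5):
    """RBF kernel: exp(-rbf_gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-rbf_gamma * sq_dist)


def svm_decision(x, support_vectors, coefficients, bias, rbf_gamma=0.5):
    """Weighted sum of kernel values against the support vectors, plus bias."""
    return sum(
        c * rbf_kernel(sv, x, rbf_gamma)
        for sv, c in zip(support_vectors, coefficients)
    ) + bias


def classify(x, support_vectors, coefficients, bias):
    score = svm_decision(x, support_vectors, coefficients, bias)
    return "malignant" if score >= 0 else "benign"


# Hypothetical toy model: one support vector per class.
support_vectors = [[0.0, 0.0], [2.0, 2.0]]
coefficients = [-1.0, 1.0]
bias = 0.0

near_second = classify([1.9, 2.1], support_vectors, coefficients, bias)
near_first = classify([0.1, 0.0], support_vectors, coefficients, bias)
print(near_second, near_first)
```

In practice the coefficients and bias come from training, for example with an SVM library, and the fused deep feature vector takes the place of the two-dimensional toy inputs.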

VI. Results and Discussion
This research considered a few key facts as the driving force to design a novel, first-of-its-kind hybrid deep-feature assisted machine learning model for breast cancer tissue identification and classification. These facts are: (i) most existing medical image and machine learning based breast cancer detection systems focus primarily on classifier-centric efforts, (ii) only a few studies consider the feature aspect to perform accurate cancer tissue identification and classification, (iii) most approaches employing pre-processing, ROI segmentation, feature extraction and classification incur high computational overhead, which might be even more severe for large-scale data, (iv) most classical deep learning models apply shallow features to perform classification, and (v) in-depth performance analyses of different studies employing the same classifier or the same data have exhibited varied performance, indicating bias in the published results. At the same time, most studies, especially those performing deep learning-based breast cancer detection and classification, have indicated that "deep features" have a direct impact on the accuracy and reliability of classification. Therefore, considering the above facts, we designed a first-of-its-kind solution which employs significantly deep features from two different well-known deep learning models to perform breast cancer tissue identification and classification. This research builds on the premise that fusing deep features from highly efficient deep feature extractors and subsequently classifying them with a machine learning model can yield an optimal solution for breast cancer tissue identification and classification. We first surveyed a significantly large number of related existing systems and identified an experimental setup with best-in-class feature extractors and classifier. We identified AlexNet and ResNet as the two deep feature extractors which have performed better than other state-of-the-art deep learning methods. Similarly, considering its ease of implementation, SVM, especially with the RBF kernel, was identified as the machine learning classifier to perform two-class classification. Noticeably, in this research the prime motive was to employ AlexNet-CNN and ResNet50 as the feature extractors, where the first obtains high-dimensional features (i.e., 4096-dimensional features at FC6) and the second a diverse feature set (using the shorted residual deep learning concept). These features, as a combined feature vector, can provide sufficient information to make an optimal classification decision.
The proposed model involved four consecutive phases: data collection and augmentation, feature extraction, feature fusion and classification. In this paper, we used the DDSM mammographic dataset [101] to assess the efficiency of the proposed system for breast cancer tissue identification. We performed data augmentation over the input mammography breast images, which normalized the images into a common dimensional space. Subsequently, the augmented images were fed as input to the AlexNet-CNN and ResNet50 models individually. Here, we designed AlexNet-CNN with five CONV layers and three FC layers, though we considered only the two FC layers FC6 and FC7, which contain 4096-dimensional features, for further classification. In our AlexNet-CNN design we set the zero padding to 2 and the stride to 4. Additionally, we applied Max-Pooling and ReLU layers to enhance feature retention and computational efficiency. Here, we applied 50% dropout (i.e., a dropout rate of 0.5) with Max-Pooling, which retained only 50% of the features for further computation. We applied a learning rate of 0.0001. The FC6 features retrieved by AlexNet-CNN were thus retained as the first feature set for breast cancer tissue identification. Subsequently, we applied a modified ResNet deep learning model with 45 layers (referred to here as ResNet50) as the residual network to extract more significant deep features. In the ResNet50 design we considered a stride of 2. Notably, in both deep feature extractors we applied the Adam optimizer with a learning rate of 0.0001, and we trained the model for a total of 200 epochs. Having obtained the feature sets from AlexNet-CNN and ResNet50 separately, we performed feature fusion so as to use both for better learning. To keep the operation computationally efficient, we concatenated both feature sets into a final feature vector, which was given as input to the machine learning model.
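Two of the steps described above can be sketched as simple arithmetic: the spatial size produced by a strided, padded convolution, and fusion by concatenation. This is a minimal sketch under stated assumptions, not the authors' implementation. The 224x224 input size, the 11x11 first-layer kernel and the 2048-dimensional ResNet feature length are assumptions for illustration (the paper reports only the padding of 2, the stride of 4 and the 4096-dimensional FC6 features), and the helper names are hypothetical.

```python
def conv_output_size(input_size, kernel_size, stride, padding):
    """Spatial size after a convolution: floor((W - K + 2P) / S) + 1."""
    return (input_size - kernel_size + 2 * padding) // stride + 1


def fuse_features(alexnet_features, resnet_features):
    """Concatenate two per-image feature vectors into one hybrid vector."""
    return list(alexnet_features) + list(resnet_features)


# Padding 2 and stride 4 are from the text; input size 224 and kernel
# size 11 are ASSUMED here because the paper does not state them.
first_conv = conv_output_size(224, kernel_size=11, stride=4, padding=2)

ALEXNET_DIM = 4096  # FC6 feature length, as reported in the text
RESNET_DIM = 2048   # ASSUMED length of the residual-network feature

# Placeholder vectors standing in for real extracted features.
alexnet_vec = [0.0] * ALEXNET_DIM
resnet_vec = [1.0] * RESNET_DIM

hybrid = fuse_features(alexnet_vec, resnet_vec)
print(first_conv, len(hybrid))  # prints "55 6144"
```

Concatenation keeps every component of both feature sets and adds no trainable parameters, so the fusion step itself is cheap; the cost is a longer input vector for the classifier.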
Our proposed model employed both deep features and a machine learning model for breast cancer tissue identification and subsequent classification. We applied the SVM-RBF classifier with 10-fold cross-validation on the DDSM mammography breast images [101]. Because the approach is feature-sensitive, we performed 10-fold cross-validation based classification separately with each of the different features: AlexNet-CNN, ResNet50 and the hybrid (fused) feature. To assess performance, we estimated the confusion matrix, from which we derived performance variables such as accuracy, precision, recall (sensitivity), specificity and F-Measure. The performance variables and their derivation are given in Table II.
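For reference, the performance variables named above can be derived from the four confusion-matrix counts as in the following minimal sketch. It uses the standard textbook definitions, which are assumed to match the table cited above, and it assumes that the malignant class is the positive class. The counts in the example are hypothetical and are not results from this study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Derive standard two-class metrics from confusion-matrix counts.

    tp/fn count positive (assumed: malignant) samples classified
    correctly/incorrectly; tn/fp do the same for negative samples.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0          # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f_measure": f_measure,
    }


# Hypothetical counts for one evaluation run (not data from the paper).
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
```

Reporting precision, recall and specificity alongside accuracy matters for a two-class diagnostic task, because accuracy alone can hide a high rate of missed malignant cases when the classes are imbalanced.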
The overall performance analysis has been done in two ways, intra-model comparison and inter-model comparison.
Here, intra-model comparison characterizes the performance of the different features (AlexNet-CNN, ResNet50 and hybrid) with SVM-RBF. In contrast, inter-model assessment compares the performance of our proposed hybrid deep featured (AlexResNet+ featured) breast cancer tissue classification model with other existing state-of-the-art algorithms. The detailed discussion of the results and allied inferences is given as follows:

A. Intra-Model Performance Assessment
This assessment analyses the performance of the three different features (AlexNet-CNN, ResNet50 and hybrid) with SVM variants. An in-depth survey shows that, amongst the major machine learning algorithms, SVM has been applied in a large number of efforts, where it has performed well (see, for example, [31]). Taking this as motivation, we performed two-class classification using a 10-fold cross-validation based SVM classifier. We tested performance with SVM in conjunction with different kernel functions, namely linear, polynomial and RBF. The results obtained for the targeted breast cancer tissue classification are given for SVM-Linear (Table III), SVM-Polynomial (Table IV) and SVM-RBF (Table V).

[...] robustness of AlexResNet+ features towards breast cancer tissue identification and classification. Considering the best performance of the different classifiers (in conjunction with the different features), we obtain the following (Table VI).

[...] have shown better accuracy; however, the authors failed to characterise the performance in terms of repeated performance measures such as precision, sensitivity, F-Measure or F-Score. They also failed to address key concerns such as the data sample (i.e., the number of samples) and feature sensitivity. A closer look shows that the majority of machine learning methods, as indicated above, employ certain pre-processing followed by ROI segmentation [62] and feature extraction [35], [36]. The probability that such methods impose significantly high computational overhead and time consumption cannot be ignored. Such limitations can restrict the robustness of these methods. Moreover, the majority of works have been trained on very small data with predefined ROI and allied feature extraction. Despite having higher accuracy, such approaches cannot be generalized towards a global solution.
Moreover, most of these approaches failed to assess their performance in terms of precision, recall (sensitivity) and F-Measure, which show how efficient a model can be under repeated tests and with varying features. In contrast, we examined our proposed model in terms of these different performance variables, which indicates higher reliability for the breast cancer tissue classification task at hand. The authors in [35] performed nuclei segmentation followed by feature extraction and classification. The authors in [36] stated that an ensemble learning classifier can achieve higher accuracy with the maximum voting concept; however, they did not address enhancing the input features, which has a direct impact on accuracy and reliability. The works in [43], [46], [47], [72], which claimed to have achieved better performance, were also classifier-oriented efforts and did not focus on retaining more informative features to make classification more efficient. The work in [44] too was a text-data mining-based [...]

The simulation results in this paper (Table III to Table V) affirm satisfactory performance by ResNet50, which was designed in such a manner that it retains suitable features while maintaining low computational overhead. Thus, amalgamating the deep features from the AlexNet-CNN and ResNet50 deep learning models provided an optimal feature set for further two-class classification using SVM-RBF. The overall simulation results and corresponding performance inferences indicate that the proposed model can be regarded as a more reliable and ready-to-use solution for breast cancer tissue identification and classification. Being a deep feature based approach, it can also be applied to fine-grained or micro-histopathological images, where segmenting the ROI with classical methods is very difficult and inaccurate.
Considering the overall research outcomes and allied inferences, this study confirmed that the use of hybrid deep features from different deep learning models such as AlexNet-CNN and shorted ResNet enables a more efficient and reliable breast cancer diagnosis solution. From the comparative performance in Table III, Table IV and Table V, observing the individual performance with the different features as well as the combined features (i.e., AlexResNet+), it can be found that the amalgamation of these two deep features yields better performance. This supports the acceptance of RQ1. To achieve such performance, the role of a computationally efficient and optimal design cannot be ignored. The results affirm that the use of AlexNet-CNN with the 5CONV-3FC architecture and the shorted ResNet50 deep learning model can generate a more efficient feature set for breast cancer diagnosis. Thus, RQ2 is accepted. Revisiting the performance assessment in Table V, it can be seen that the SVM-RBF classifier achieves better performance with the AlexResNet+ feature, which confirms the acceptance of RQ3. Considering both the intra-model and inter-model performance analyses, and assessing the respective strengths and limitations, this study confirms that the proposed AlexResNet+ model can be called a reliable and more efficient solution for mammogram image based breast cancer diagnosis. This affirms the acceptance of RQ4.

VII. Conclusion
In this paper, a novel, first of its kind hybrid deep feature-based machine learning model was developed for breast cancer tissue identification and classification. Unlike classical approaches, the proposed model amalgamates the strengths of both deep learning and machine learning algorithms, which enables more reliable and efficient performance. The proposed system first applies two well-known deep learning models, AlexNet and shorted ResNet50, identified so far as the most efficient, to perform deep feature extraction. This research hypothesized that the use of deep features and their strategic amalgamation can help achieve more reliable breast cancer tissue pattern learning and subsequent classification. Accordingly, distinct features were obtained from AlexNet and shorted ResNet50, where the former was designed with five convolutional layers and three fully connected layers, while the latter was designed as a modified model. The fused feature was used to train an SVM with the RBF kernel function, which classified each mammogram image as benign or malignant. The simulation results with the DDSM dataset revealed that the proposed hybrid feature based model achieves better performance in terms of accuracy (95.87%), precision (0.9760), sensitivity (1.00), F-Measure (0.9878), specificity (0.9621) and AUC (0.96). The relative performance of the AlexNet-CNN and ResNet50 models revealed that the proposed hybrid feature based model achieves better performance than the other state-of-the-art approaches. This could be attributed to the additional deep features which, in conjunction with the shorted ResNet50 deep model, enabled an optimal set of features for further classification and allied decision making. The superior performance of the AlexResNet+ feature with the SVM-RBF classifier can be applied in real-world CAD applications, especially for breast cancer tissue classification (as benign or malignant). Although this approach achieved better performance, future efforts can be made to use more efficient machine learning models for higher accuracy.