Signature Verification Using Support Vector Machine and Convolution Neural Network

A signature is used to recognize an individual. It is a mark that a person writes on paper as proof of identity, and it serves as a unique identifying feature. Signatures are widely used in social and business transactions, which gives rise to the need for signature verification. Because a signature can be forged, identifying a signature as genuine or forged is of utmost importance. In this paper, signatures are classified as genuine or forged using two approaches: the first uses a Support Vector Machine (SVM) and the second uses a Convolution Neural Network (CNN). For SVM, the signature image is pre-processed and features are extracted. The extracted features are histogram of gradient, shape, aspect ratio, bounding area, contour area and convex hull area. SVM is then applied to classify the signature as genuine or forged, and accuracy is determined. In the second approach, the signature image is pre-processed, a CNN classifies the signature as genuine or forged, and accuracy is determined. The dataset used is the ICDAR Dutch dataset along with 80 signatures taken from 4 people. The Dutch dataset consists of 362 signature images, and each of the 4 people contributed 10 genuine and 10 forged signatures, for a total of 442 signature images. The proposed system provides an accuracy of 86.39% using SVM and around 83.78% using CNN.


Introduction
A signature is used for identity verification of an individual. Because signatures are so widely used, there is a risk of them being forged. Forgery of signatures in banks can lead to huge monetary loss, and it can lead to huge losses for an individual too. Thus, there is a great need for recognition and verification of signatures. Signature recognition and verification is treated as challenging work in biometrics. The problem can be defined as finding a solution that differentiates between real and forged signatures in a given set of signatures. Important applications of signatures include banking, finance, security, examination institutions, etc., where the chance of forgery is at its peak. Disguised signatures, which are written by the genuine author with the intention of later denying them, are considered difficult to identify. This is mostly done for fraudulent purposes. The types of forgery are random forgery, skilled forgery, and simple forgery. In a random forgery, the forger uses his or her own signature in place of another user's. A skilled forgery is one where the forger imitates another user's signature. In a simple forgery, the forger is familiar with the shape of the signature but has had little practice reproducing it. Distinguishing random forgeries is considered easier than distinguishing skilled forgeries [1]-[2].
For classification of signatures, the shape of the signature is considered, which shows the vertical and horizontal trajectories formed by the author's hand movement. For online signature verification and recognition, static and dynamic features are captured from pen-based tablets. Some of the dynamic features are velocity, stroke order, acceleration, angle, pressure, etc. These features are distinctive and difficult to forge [3]. Offline handwritten signatures are captured as images using a scanner or camera. Offline signature verification is considered complex because no dynamic characteristics are available [4]. Scanned signature images contain noise. To remove this noise, spatial and frequency domain techniques are used. Muhammad Imran Malik et al. [2] present signature analysis and verification by applying Speeded Up Robust Features (SURF). Signature verification is done based on the results. This system achieved an EER of 15%. Saeede Anbaee Farimani et al. [3] present online signature verification and recognition by applying the Hidden Markov Model (HMM) technique. It segments the signature curve depending on the pen's velocity value. The FAR achieved is 4.8% and the FRR is 5%. Clustering is another approach for verification of signatures. Elements within a cluster are more similar to each other than to elements in another cluster. This technique is also useful in other fields such as face recognition and thumb impression recognition [5].
Mohitkumar A. Joshi et al. [6] present the use of low-level key stroke features, with recognition performed using a Support Vector Machine with 3-fold cross validation. The EER achieved in this research is 15.59%. Derlin Morocho et al. [7] calculate the performance of correlative attributes for verification of signatures. The EER achieved ranges from 5.5% to 21.2%. Features which affect the accuracy are local binary pattern, histogram of gradient, gray level co-occurrence matrix, SURF, etc. These features help in verification of signature images [8]. Research in papers [9]-[10] refers to signature recognition using SVM. This process includes preprocessing and feature extraction, after which SVM is applied. Features such as centroid, centre of gravity, number of loops and normalized area are extracted [10]. In papers [11]-[12], signature verification is done using a Convolution Neural Network model. Results obtained by extracting features from a Deep Convolution Neural Network with SVM as the classifier are better than those of a decision tree [11].
Anjali R. et al. [13] worked on a combination of SVM and Neural Network. They calculated gray level features from signature images. A neural network trained with the feed forward back propagation algorithm is used along with a Support Vector Machine. For the cursive nature of signatures, curve based as well as gradient based feature extraction methods are used, such as histogram of curvature along with histogram of gradient [14]. Features such as an underscore beneath the signature, the presence of a dot on the signature, ending strokes, etc. can be used for recognition. The next step includes splitting of the signature to calculate the performance.
Splitting is performed in five categories: left, right, bottom, up and middle. An ANN together with a structural identification algorithm is used for prediction [15]. Nan Li et al. [16] worked on authenticating users with electronic signatures on mobile devices. Coordinates, contact area, pressure and other biometric data were collected. The algorithms used for classification were SVM, Logistic Regression, Random Forest and AdaBoost, after which signature verification is done. AdaBoost performed best of the four algorithms, with an error rate of 2.375%. For multiscript signature verification, a generalized combined segmentation-verification technique is applied to a multiscript signature dataset with SVM as the classifier [17].
Moises Diaz et al. [18] present an algorithm for generation and detection of duplicated offline signature images. It is based on linear as well as nonlinear transformations and simulates the human spatial cognitive map. The duplicator is evaluated by artificially enlarging the training sequence and measuring the resulting verification performance. Muhammad Imran Malik et al. [19] used Fast Retina Keypoint (FREAK), which represents local features and is modelled on the human visual system, specifically the retina. For performance comparison, Features from Accelerated Segment Test (FAST) and Speeded Up Robust Features (SURF) were used. This system achieved an error rate of 30%.
The remainder of the paper is organized as follows: Section II describes the proposed system, Section III explains the proposed algorithm, Section IV discusses the experimental results, and Section V concludes the paper.

Proposed System
In this system, verification of signatures is done using Support Vector Machine and Convolution Neural Network. This research involves identifying the genuineness of a signature, so it requires a dataset of genuine and forged signatures. The dataset used here is the ICDAR Dutch dataset along with 80 signatures taken from 4 people. The Dutch dataset consists of 362 signature images, and each of the 4 people contributed 10 genuine and 10 forged signatures, for a total of 442 signature images. The Dutch dataset includes signatures of 10 reference writers and skilled forgeries of these signatures. Figure. 1 demonstrates the working of the proposed system using SVM.

Figure. 1. Working of the proposed system using SVM

This system works in three phases, namely training, testing and classification. The training phase acquires PNG images of known genuine and forged signatures. The testing phase acquires unlabelled signature images. Initially the system operates in the training phase, which passes through the pre-processing and feature extraction steps in turn. In pre-processing, the image is first converted to grayscale. A grayscale image contains only shades of gray and no colour; it differs from a colour image in that less information is stored for each pixel. The image is then resized to [200,200]. The next step is thresholding, which is considered the simplest segmentation method. Thresholding is applied to the grayscale image to form a binary image: every pixel whose intensity is less than a specific fixed constant is replaced by a black pixel, and every other pixel by a white pixel. The feature extraction step takes the pre-processed signature image as input and extracts features such as shape, histogram of gradient, aspect ratio, bounding area, contour area and convex hull area [22][23][24]. The description of each feature is given below.
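• Hu Moments: These are calculated to describe the shape of the signature image. Hu moments are a set of seven numbers computed from central moments, and they are invariant to image transformations. The first six moments are invariant to translation, scale, reflection, and rotation, whereas the sign of the seventh moment changes under image reflection. The seven moments are calculated from the normalized central moments $\eta_{pq}$ using the following formulas:
$h_1 = \eta_{20} + \eta_{02}$ …(1)
$h_2 = (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2$ …(2)
$h_3 = (\eta_{30} - 3\eta_{12})^2 + (3\eta_{21} - \eta_{03})^2$ …(3)
$h_4 = (\eta_{30} + \eta_{12})^2 + (\eta_{21} + \eta_{03})^2$ …(4)
$h_5 = (\eta_{30} - 3\eta_{12})(\eta_{30} + \eta_{12})[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2] + (3\eta_{21} - \eta_{03})(\eta_{21} + \eta_{03})[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2]$ …(5)
$h_6 = (\eta_{20} - \eta_{02})[(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2] + 4\eta_{11}(\eta_{30} + \eta_{12})(\eta_{21} + \eta_{03})$ …(6)
$h_7 = (3\eta_{21} - \eta_{03})(\eta_{30} + \eta_{12})[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2] - (\eta_{30} - 3\eta_{12})(\eta_{21} + \eta_{03})[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2]$ …(7)
• Histogram of Gradient: The gradient of the image is computed in the x and y directions as:
$G_x = I * D_x$ …(8)
$G_y = I * D_y$ …(9)
where $I$ is the input image, $D_x$ is our filter in the x-direction, and $D_y$ is our filter in the y-direction. The final gradient magnitude is calculated using the formula:
$|G| = \sqrt{G_x^2 + G_y^2}$ …(10)
Finally, the orientation of the gradient for each pixel in the input image can then be computed by:
$\theta = \operatorname{arctan2}(G_y, G_x)$ …(11)
Given both $|G|$ and $\theta$, we can now compute the histogram of gradients, where the bin of the histogram is based on $\theta$ and the contribution or weight added to a given bin of the histogram is based on $|G|$.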
• Aspect ratio of bounding rectangle: It is the width to height ratio of the bounding rectangle. The aspect ratio ensures that an image is displayed in the correct proportions irrespective of its size, so if the object is scaled, the signature image inside the object's bounding box is still displayed with the correct aspect ratio. The formula for calculating aspect ratio is as follows:
Aspect Ratio = Width / Height …(12)
• Area of bounding rectangle: It calculates the area of the bounding rectangle by the formula:
Area = Width * Height …(13)
• Contour Area: Contour area is the area inside the contour. The contour should be closed; if it is not, it is treated as closed. For an open curve such as a line, the area considered is the one enclosed by the curve together with the segment joining its first and last points.
• Convex Hull Area: For a set of points S, the convex hull is the intersection of all half spaces that contain S. The convex hull of a set is a closed solid region which includes all the points in its interior. The convex hull C of N points $p_1, ..., p_N$ is given by the expression:
$C = \left\{ \sum_{j=1}^{N} \lambda_j p_j : \lambda_j \ge 0 \text{ for all } j, \ \sum_{j=1}^{N} \lambda_j = 1 \right\}$ …(14)
• Haralick Features: These are texture features that depend on an adjacency matrix. Entry (i, j) of the adjacency matrix stores the number of times a pixel with value i lies next to a pixel with value j. Haralick features use the Gray Level Co-occurrence Matrix (GLCM) for adjacency. Four GLCMs are constructed for a single image. From these, 14 textural features are computed: Angular Second Moment, Contrast, Correlation, Sum of Squares: Variance, Inverse Difference Moment, Sum Average, Sum Variance, Sum Entropy, Entropy, Difference Variance, Difference Entropy, Information Measures of Correlation (two measures) and Maximal Correlation Coefficient [25].
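The geometric features in the list above (aspect ratio, bounding area, contour area and convex hull area) can be illustrated with a minimal pure-Python sketch. It is not the implementation used in this work: the function names are hypothetical, the contour is assumed to be a list of (x, y) points, the bounding rectangle is measured in continuous coordinates, the area uses the shoelace formula, and the hull uses Andrew's monotone chain algorithm.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def bounding_rectangle(points: List[Point]) -> Tuple[float, float]:
    """Return (width, height) of the axis-aligned bounding rectangle."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return max(xs) - min(xs), max(ys) - min(ys)


def polygon_area(points: List[Point]) -> float:
    """Shoelace formula; the last point is joined back to the first,
    so an open contour is treated as closed."""
    twice_area = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def geometric_features(contour: List[Point]) -> Dict[str, float]:
    width, height = bounding_rectangle(contour)
    return {
        "aspect_ratio": width / height,    # equation (12)
        "bounding_area": width * height,   # equation (13)
        "contour_area": polygon_area(contour),
        "convex_hull_area": polygon_area(convex_hull(contour)),
    }


# Toy L-shaped contour (illustrative only, not taken from the dataset).
contour = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 3), (0, 3)]
features = geometric_features(contour)
```

For the toy contour the convex hull area exceeds the contour area, because the hull fills in the concave corner of the L shape.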
After completion of the training phase, the system enters the testing phase. The dataset is split into training and testing sets: the training set contains 80% of the signature images and the testing set contains the remaining 20%. The testing phase acquires PNG images of unknown signatures. These images are pre-processed and their features are extracted. The classifier predicts a label for each testing image, the predictions are compared with the actual labels, and accuracy is determined. Finally, a random image is passed to the saved model, which predicts the genuineness of the signature. Figure. 2 demonstrates the working of the proposed system using CNN.
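The 80%/20% split and the accuracy computation described in this paragraph can be sketched as follows. This is only an illustration under stated assumptions: the function names, the fixed random seed and the placeholder samples and labels are hypothetical, and only the dataset size of 442 comes from the text.

```python
import random
from typing import List, Sequence, Tuple


def split_dataset(
    samples: Sequence, labels: Sequence, test_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[tuple], List[tuple]]:
    """Shuffle the indices and hold out `test_fraction` of the items for testing."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = [(samples[i], labels[i]) for i in train_idx]
    test = [(samples[i], labels[i]) for i in test_idx]
    return train, test


def accuracy(actual: Sequence, predicted: Sequence) -> float:
    """Fraction of predicted labels that match the actual labels."""
    matches = sum(1 for a, p in zip(actual, predicted) if a == p)
    return matches / len(actual)


# Placeholder data: integers stand in for the 442 feature vectors,
# and the alternating labels are hypothetical.
samples = list(range(442))
labels = ["genuine" if i % 2 == 0 else "forged" for i in samples]
train, test = split_dataset(samples, labels)
```

With 442 images and a test fraction of 0.2, this sketch holds out 88 images for testing and keeps 354 for training.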

Figure. 2. Working of the proposed system using CNN

Initially, the input signature image is pre-processed. Pre-processing includes loading the image, resizing it to 32*32 pixels, converting it to an array, scaling the pixel intensities to the range [0, 1] and updating the image list. The dataset is partitioned into training and test sets, using 80% of the dataset for training and 20% for testing. Next, a 2D Convolution Neural Network architecture is defined and the model is trained using the Adam optimizer for 400 epochs. The model is evaluated and its accuracy is computed. Finally, an unknown image is passed to the model, which predicts the genuineness of the signature.
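As an illustration of two operations mentioned above, the sketch below scales 8-bit pixel intensities to the range [0, 1] and applies a single 2D convolution followed by ReLU, which is the basic operation inside a convolution layer. It is not the network used in this work: the architecture is not specified in the text, and the function names, the toy 32*32 image and the 3*3 edge kernel are assumptions chosen for the example.

```python
from typing import List

Image = List[List[float]]


def scale_pixels(image: List[List[int]]) -> Image:
    """Map 8-bit intensities (0..255) to the range [0, 1]."""
    return [[value / 255.0 for value in row] for row in image]


def conv2d_valid(image: Image, kernel: Image) -> Image:
    """'Valid' 2D cross-correlation (no padding, stride 1), as applied
    by a convolution layer to one input channel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0.0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


def relu(feature_map: Image) -> Image:
    """Replace negative responses with zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


# Toy 32*32 image: left half black (0), right half white (255).
raw = [[0] * 16 + [255] * 16 for _ in range(32)]
scaled = scale_pixels(raw)

# Hypothetical 3*3 kernel that responds to dark-to-light vertical edges.
kernel = [[-1.0, 0.0, 1.0] for _ in range(3)]
feature_map = relu(conv2d_valid(scaled, kernel))
```

The resulting 30*30 feature map is non-zero only in the columns whose 3*3 window straddles the black-to-white edge. In a trained network the kernel values are learned rather than fixed by hand.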

Proposed Algorithm
Signature recognition is done using two algorithms, i.e. SVM and CNN. The proposed algorithm for SVM is given below.
A. SVM works on a dataset which has been partitioned into training and testing datasets. The training dataset contains known genuine and forged PNG signature images, and the testing dataset contains unknown signature images. The steps of the proposed algorithm using SVM are as follows:
• Repeat steps A1 to A3 for all images in the training dataset to prepare the reference feature vector.
1. Select a PNG image from the training dataset.
2. Pre-process the selected image using the following steps:
a. Convert the image to a grayscale image.
b. Resize the image to [200,200] size.
c. Convert the image to binary form.
3. Perform feature extraction on the pre-processed image using the following step:
a. Compute shape, histogram of gradient, aspect ratio, bounding area, contour area, convex hull area and Haralick features.
4. Now pass the feature vector of the image to the SVM classifier.
• Select a PNG image from the testing dataset.
• Prepare the testing feature vector by applying steps A2 and A3 to the testing image.
• Pass the testing dataset to the running model to evaluate the accuracy.
• Using the testing dataset, make predictions and initialize a dictionary to accumulate computed metrics.
• Load the saved model, which will find the label of signature images based on their features.
• Now pass a random image to the saved model, and the model predicts the genuineness of the signature.
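The pre-processing steps 2a to 2c of the SVM algorithm can be sketched in pure Python as below. This is a minimal sketch rather than the implementation used in this work: the luminance weights, nearest-neighbour resizing, the threshold of 127 and all function names are assumptions, since the text only states that a fixed constant is used.

```python
from typing import List, Tuple

RGBImage = List[List[Tuple[int, int, int]]]
GrayImage = List[List[int]]


def to_grayscale(rgb: RGBImage) -> GrayImage:
    """Step 2a: weighted sum of the R, G and B channels (assumed weights)."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb]


def resize_nearest(image: GrayImage, height: int, width: int) -> GrayImage:
    """Step 2b: nearest-neighbour resize to (height, width)."""
    src_h, src_w = len(image), len(image[0])
    return [[image[r * src_h // height][c * src_w // width] for c in range(width)]
            for r in range(height)]


def to_binary(gray: GrayImage, threshold: int = 127) -> GrayImage:
    """Step 2c: pixels below the threshold become black (0), others white (255)."""
    return [[0 if value < threshold else 255 for value in row] for row in gray]


def preprocess(rgb: RGBImage, size: Tuple[int, int] = (200, 200),
               threshold: int = 127) -> GrayImage:
    gray = to_grayscale(rgb)
    resized = resize_nearest(gray, size[0], size[1])
    return to_binary(resized, threshold)


# Toy 2*2 image: one dark "ink" pixel in the top-left corner on light paper.
rgb = [[(10, 10, 10), (250, 250, 250)],
       [(250, 250, 250), (250, 250, 250)]]
binary = preprocess(rgb)
```

After resizing, the dark pixel covers the top-left quarter of the 200*200 binary image, and every pixel is either 0 or 255.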
The proposed algorithm for CNN is given below.
B. CNN works on a dataset which has been partitioned into training and test datasets. The steps of the proposed algorithm using CNN are as follows:
• Repeat steps A5 and A6 for all images in the training dataset to prepare the reference feature vector.
5. Select a PNG image from the training dataset.
6. Pre-process the selected image using the following steps:
a. Load the input image from disk and resize it to 32*32 pixels.
b. Scale the pixel intensities to the range [0, 1] and update the image list.
c. Define the 2D Convolution Neural Network architecture and train the model using the Adam optimizer.
d. Train the model with 400 epochs.
7. Evaluate the model and compute its accuracy on the training set.
• Prepare the testing feature vector by applying steps A5 and A6 to the testing dataset.
• Pass the testing dataset to the running model to evaluate the accuracy.
• Calculate the confusion matrix and find accuracy, specificity, and sensitivity.
• Now pass a random image to the saved CNN model, and the model predicts the genuineness of the signature.
• Using the testing dataset, make predictions and initialize a dictionary to accumulate computed metrics.
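The confusion-matrix step above can be sketched as follows. The sketch assumes that "genuine" is treated as the positive class, which the text does not state, and the function names and toy labels are hypothetical rather than results from this work.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(
    actual: Sequence[str], predicted: Sequence[str], positive: str = "genuine"
) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) with `positive` as the positive class."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1  # forged signature accepted as genuine
        elif a == positive:
            fn += 1  # genuine signature rejected as forged
        else:
            tn += 1
    return tp, fp, tn, fn


def metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Toy labels for illustration only.
actual = ["genuine"] * 4 + ["forged"] * 4
predicted = ["genuine", "genuine", "genuine", "forged",
             "forged", "forged", "forged", "genuine"]
tp, fp, tn, fn = confusion_counts(actual, predicted)
scores = metrics(tp, fp, tn, fn)
```

Swapping the positive class exchanges sensitivity and specificity, so the choice of positive class should be stated alongside the reported values.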

Experimental Results
Performance comparison of signature verification using SVM and CNN is discussed below:
A. Results obtained using SVM: Overall accuracy of 86.39% is obtained using SVM. The confusion matrix for SVM is given in Table I.
B. Results obtained using CNN: Overall accuracy of 83.76% is obtained using CNN. The confusion matrix for CNN is given in Table II.
The approach in [7] showed an EER of 21.20%. Offline Signature Verification based on low level key strokes [6] showed an EER of 15.59%, whereas Offline Signature Recognition using Support Vector Machine [10] showed an EER of 7.16%. In this research, the accuracy obtained with SVM as a classifier is 86.39% and the accuracy obtained using CNN is 83.78%.

Conclusion
To avoid forgery of signatures in the public, private or other sectors, a signature is recognized as genuine or forged using two different approaches. An approach to identify the genuineness of a signature using Support Vector Machine and Convolution Neural Network is discussed here. The proposed system provides overall accuracy of 84.80% using SVM and 87.00% using CNN. A performance comparison of both approaches is discussed. Our next objective is to improve accuracy by adding more features.