A Method for Improving DNN Performance Using 2-bit Quantization

Sang Hyeok Kim et al.

Abstract

Background/Objectives: Interest in AI (Artificial Intelligence) has grown in recent years, and many studies aim to make AI usable in embedded and mobile environments. Quantization is one method for reducing model size, but quantization below 8 bits usually cannot be implemented without additional hardware such as an FPGA. With this in mind, in this paper we propose two new algorithms that implement 2-bit quantization purely in software.
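For concreteness, a minimal sketch of a 2-bit software quantizer is shown below (NumPy; the uniform 4-level scale-and-offset scheme and the function names are illustrative assumptions, since the abstract does not specify the paper's exact quantization rule):

```python
import numpy as np

def quantize_2bit(weights):
    # Illustrative uniform quantizer (an assumption, not the paper's rule):
    # split the weight range into four levels and keep the level index.
    # Assumes weights.max() > weights.min().
    w_min, w_max = float(weights.min()), float(weights.max())
    scale = (w_max - w_min) / 3.0            # 4 levels -> 3 intervals
    codes = np.round((weights - w_min) / scale).astype(np.uint8)  # values 0..3
    return codes, scale, w_min

def dequantize_2bit(codes, scale, w_min):
    # Recover approximate 32-bit weights from the 2-bit codes.
    return codes.astype(np.float32) * scale + w_min
```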


Methods/Statistical analysis: We propose a packing operation that quantizes weights consisting of 32-bit real values to 2 bits and stores four 2-bit quantized weights in a single 8-bit memory cell, together with a Masking Matrix Multiplication function that performs the computation between the packed weights and the input values. Both functions operate in parallel in GPU memory.
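The abstract names the packing operation and the Masking Matrix Multiplication but not their internals; a minimal CPU-side sketch under stated assumptions (NumPy, four codes packed little-endian within each byte, dequantization on the fly, and the hypothetical helper names pack_2bit and masked_matmul) could look like the following. In the paper these functions run in parallel in GPU memory:

```python
import numpy as np

def pack_2bit(codes):
    # Pack four 2-bit codes (values 0..3) into one uint8.
    # Code i of each group occupies bits 2*i .. 2*i+1 (an assumed layout).
    c = codes.reshape(-1, 4).astype(np.uint8)
    return c[:, 0] | (c[:, 1] << 2) | (c[:, 2] << 4) | (c[:, 3] << 6)

def masked_matmul(packed, scale, w_min, x, rows, cols):
    # Multiply a packed 2-bit weight matrix (rows x cols) by input x.
    # Each byte is masked and shifted to recover its four codes, which
    # are dequantized on the fly before the matrix product.
    bytes_ = packed.reshape(rows, cols // 4)
    w = np.empty((rows, cols), dtype=np.float32)
    for i in range(4):                        # extract code i of every byte
        w[:, i::4] = (bytes_ >> (2 * i)) & 0b11
    w = w * scale + w_min                     # dequantize
    return w @ x

# Demo with random 2-bit codes for an 8x8 weight matrix:
codes = np.random.randint(0, 4, size=64).astype(np.uint8)
packed = pack_2bit(codes)                     # 16 bytes instead of 64 weights
y = masked_matmul(packed, 0.1, -0.15, np.ones(8, dtype=np.float32), 8, 8)
```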


Findings: Compared with the existing 32-bit model, the quantized model built on these functions used about 16 times less memory and computed about 4 times faster. Despite this compression, the DNN model showed an error of only around 1% when trained on MNIST and HandWritten data, and the CNN model likewise showed an error of around 1% when trained on EEG (Electroencephalography) data.
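The 16x memory figure follows from the bit widths alone: each 32-bit weight shrinks to 2 bits, so four weights share one byte. A quick check for a hypothetical 1024 x 1024 weight matrix:

```python
rows, cols = 1024, 1024                  # hypothetical layer size
fp32_bytes = rows * cols * 4             # 32-bit floats: 4 bytes per weight
packed_bytes = rows * cols // 4          # four 2-bit codes per byte
print(fp32_bytes / packed_bytes)         # 16.0 -> the ~16x memory saving
```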


Improvements/Applications: The functions in this study target the DNN domain; although they were extended to CNNs, quantization could be applied only to the FC (Fully Connected) layers. Applying them to convolutional layers requires an additional function, and future work should verify that the accuracy difference remains small on more complex datasets.
