Implementation Of A Hybrid Color Image Compression Technique Using Principal Component Analysis And Discrete Tchebichef Transform

The growing scientific and technological demand for high-resolution multimedia content has greatly increased the data volume in enterprise data centers and on Internet servers. High-resolution pictures make it challenging to transfer files containing very large volumes of image data over communication networks, and the time needed to upload or download large images has always been a key problem on the Internet. In addition to these data communication issues, high-resolution photos consume more storage capacity. Therefore, compression is an almost inevitable step towards reducing the transmission time and/or storage requirements of images. Principal component analysis (PCA) and Discrete Tchebichef Transform (DTT) algorithms are often employed for image compression in the literature. In this paper, we propose a hybrid color image compression approach based on PCA and DTT (PCA-DTT), which integrates the benefits of both algorithms. Our hybrid approach exploits (i) PCA to reduce the dimensionality of the image; and (ii) the DTT algorithm to enhance the image quality. The proposed technique is assessed against a compression method that integrates DTT with a singular value decomposition (SVD) scheme (DTT-SVD) using several performance measures: compression time (CT), compression ratio (CR), structural similarity index measure (SSIM), peak signal-to-noise ratio (PSNR), and universal quality index (UQI). The experimental results reveal that our proposed method outperforms the existing method for all kinds of image content, achieving a high compression ratio with lower computational complexity while retaining the quality of the image data.


Introduction
The amount of data produced and communicated by public service industries, nonprofit sectors, business organizations, and scientific research has grown enormously [1]. Each day our cyber world generates approximately 2.5 quintillion bytes of data (1 quintillion bytes = 1 billion gigabytes) [2]. International Data Corporation expects that the global datasphere, that is, the digital data we generate, capture, replicate, and consume, will increase from around 40 zettabytes in 2019 to 175 zettabytes in 2025 (1 zettabyte = 1 trillion gigabytes) [3]. With this overwhelming quantity of complex and diverse data pouring in from any device, at any time, and from anywhere, we are undeniably in the era of Big Data, a phenomenon also called the Data Deluge. These data range from textual content (unstructured, semi-structured, and structured) to multimedia content (e.g., audio, images, and videos) on heterogeneous platforms such as sensor networks, social media websites, machine-to-machine communications, the Internet of Things, and numerous safety-critical cyber-physical systems [4].
In this cyber world of the data deluge, a plethora of multimedia products is transferred every second, which consumes limited bandwidth and storage space. Image compression is the foremost way to represent a picture with a smaller number of bits without noticeably degrading its quality. This can be realized by removing the irrelevancies and redundancies present in the image. Compression converts the original image to its compressed form by recognizing and exploiting patterns in the image data. Such techniques play a pivotal role in several scientific domains such as communication, medical imaging, astrophysics, satellite imaging, and videoconferencing [5, 6].
Generally, image compression approaches are categorized into two distinct types: lossless and lossy methods [7]. Lossless methods compress images without loss of essential data but have a low CR [8]. In lossless compression, complete data fidelity is assured after reconstruction, but the overall CR is restricted to between 2:1 and 3:1. Because these techniques preserve data integrity throughout compression and decompression, they have been employed as an essential process in satellite communications [9], remote sensing [10, 11], medical imaging [12], etc. Lossy compression methods can compress images with a high CR at the cost of information integrity [13]. In lossy compression, the reconstructed image shows some degradation relative to the original but remains close to it. This approach is generally employed in mobile devices, the World Wide Web, digital cameras, etc. [14].
Digital images are primarily characterized by three kinds of redundancy: coding redundancy, spatial redundancy, and psychovisual redundancy [15]. Compression techniques exploit these redundancies to reduce the dimensionality of pictures [16]. Coding redundancy, also known as statistical redundancy, is addressed by using variable-length codes matched to the statistics of the input picture. Spatial redundancy reflects the fact that the gray value of one pixel can be partially predicted from the values of other pixels. Psychovisual redundancy relies on human perception of the image data [17]. Eliminating psychovisual redundancy introduces noise in the reconstructed pictures; therefore, this process is avoided in lossless compression.
Several methods have been proposed to compress images as well as to enhance their quality. Transform-domain compression is the most extensively employed approach for lossy image compression. It converts the pixels from one domain into another to generate coefficients whose representation is more compact and consequently yields a more compressed image. In this representation, a small set of coefficients carries the majority of the energy in the picture, whereas the others are close to or equal to zero. The approach employs a reversible, linear transform to map the picture into coefficients, which are subsequently quantized and compressed. An ideal transformation packs the most information into only a few coefficients. In the quantization process, the coefficients that hold the least information are discarded. Transform coding decomposes an $N \times N$ input picture into $n \times n$ nonoverlapping blocks (i.e., sub-images). Each block is then transformed individually to produce $(N/n)^2$ arrays of size $n \times n$. Generally, the correlation among neighboring pixels in an image is high. Transform compression methods exploit this correlation to realize a higher compression ratio. They achieve greater efficiency due to three aspects:
1. This type of compression is block-based: a block of image data is processed instead of only one element.
2. The quantization helps eliminate the correlation between the pixels of each block.
3. Only selected transformed coefficients are quantized and transferred to the destination side; accordingly, greater compression ratios can be obtained.
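The block decomposition described above can be illustrated with a minimal sketch. It assumes a square image stored as a list of rows and a block size that divides the image size; the function name `split_into_blocks` is hypothetical and not taken from the paper.

```python
def split_into_blocks(image, n):
    """Split an N x N image (list of rows) into non-overlapping n x n blocks."""
    size = len(image)
    if any(len(row) != size for row in image):
        raise ValueError("image must be square")
    if size % n != 0:
        raise ValueError("block size must divide the image size")
    blocks = []
    for top in range(0, size, n):          # walk block rows
        for left in range(0, size, n):     # walk block columns
            blocks.append([row[left:left + n] for row in image[top:top + n]])
    return blocks


# Toy 4 x 4 "image" whose pixel value encodes its position.
image = [[4 * r + c for c in range(4)] for r in range(4)]
blocks = split_into_blocks(image, 2)
print(len(blocks))   # (4 / 2) ** 2 = 4 blocks
print(blocks[0])     # [[0, 1], [4, 5]]
```

Each block can then be transformed and quantized independently, which is what makes the scheme block-based.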
Deploying a transformation-based compression method poses more challenges than deploying a predictive compression method; therefore, it is not desirable for applications that demand low complexity and cost [18]. Numerous transformation techniques employed in image compression are found in the literature, including the Karhunen-Loève Transform (KLT) (also called PCA) [...]
The basic idea of hybrid compression approaches is to integrate spatial-domain with transform-domain methods to enhance the compression performance as well as the quality of reconstructed images with a reduced noise factor. Another option is to apply two transformation methods to the given images. Integrating two methods enables improved representations of the given pictures, improved handling of curved shapes, and better-quality compressed images. The overall benefits of hybrid methods are higher visual quality and less noise and fewer artifacts in the output images.
In this paper, we propose a new hybrid color image compression method that integrates the PCA and DTT algorithms. Even though PCA provides a better CR, the quality of the reconstructed image is poor. Hence, the DTT algorithm is used with PCA to improve the quality of the output image as well as to achieve a more compact image. First, the input image is divided into blocks, and each sub-image is used as a sample vector. PCA then retains the eigenvectors of the covariance matrix that correspond to the largest eigenvalues to perform image compression. Finally, the dimension of the image is reduced further by means of the DTT technique.
The remainder of the manuscript is organized as follows. In Section II, we summarize previous research on image compression techniques. In Section III, we discuss our proposed hybrid color image compression technique in detail. In Section IV, we describe the implementation details of our work along with the experimental results. Finally, we conclude the paper in Section V.

Related Works
Of late, image processing professionals have been actively involved in designing compression models. With the advent of multimedia products, complicated issues have surfaced that need a deeper understanding and substantial investigation. Digital image compression, with its ability to provide valuable insights for effective communication and storage systems, has recently gained intensive attention from both researchers and academics. This section discusses some of the recent methods developed for image compression. PCA is a method for reducing the dimensionality of a higher-dimensional space [26, 27]. It extracts the basic features of a linear system using the SVD technique [28]. This technique has been extensively used to remove noise in image recognition, digital signal processing, classification problems, etc. [29-33]. Santo implemented PCA to reduce the dimension of medical images [34]. Other PCA-based digital image compression methods can be found in [35-39].
Vaish and Kumar proposed a new image compression technique using PCA and Huffman coding [40]. The dimension of the given picture is first reduced by PCA: a limited number of principal components (PCs) are used to reproduce the output image, whereas the other, less important PCs are discarded. The reproduced image is further quantized to reduce the contouring that arises because only a small number of PCs are used for image reproduction. Finally, Huffman coding is used to eliminate coding redundancy in the quantized image. This method is tested on different images and the results are compared with JPEG2000. Visual results and comparative analysis reveal that their method outperforms JPEG2000.
Yadav and Nagmode proposed a hybrid image compression method using PCA and DCT (PCA-DCT), in which PCA is used to calculate the dissimilarities, similarities, and feature vector in the form of the residual picture. DCT is then applied to reduce the dimension of the given picture [41]. To increase the compression efficiency further, Mei et al. developed an integrated technique called folded-PCA in combination with JPEG2000 [42]. This method calculates the covariance matrix by folding the spectral vector into a matrix. The eigenvectors are then used to find PCs that characterize the features of the whole picture, and JPEG2000 is used to reduce the dimension of the picture further.
Xiao et al. presented an image compression method using DTT and matrix factorization theory. In this work, the $N \times N$ transform matrix was factorized into $N + 1$ single-row elementary reversible matrices with minimum rounding errors [43]. For efficient lossless compression, an integer DTT (iDTT) was developed to realize integer-to-integer mapping. Furthermore, a series of experiments was carried out, and the results confirmed that the iDTT method not only provides a greater CR than the iDCT method but is also compatible with the widely used JPEG standard. Kishk et al. introduced an integrated image compression method that combines DWT and PCA. This method applies PCA to the wavelet coefficients of 3D input images to enhance the quality of the compressed images while attaining a higher CR. The wavelet coefficients of the image are weighted and reordered before the PCA algorithm is applied, and PCA is applied to each sub-band separately to improve the CR. The quality of the compressed 3D images is measured against the given input images [44]. Senapati et al. developed a DTT-based hybrid method, in which DTT is combined with SPIHT to compress images [45]. Suitable perceptual weights are also used to enhance the quality of the compressed image. The authors compare the system performance with several state-of-the-art compression schemes. Nevertheless, the abovementioned approaches cannot provide a finite resolution and need a substantial amount of processing time to compress the given image.

Proposed System
In this section, we propose a hybrid color image compression approach based on the PCA-DTT algorithms, which integrates the benefits of both PCA and DTT. The proposed method is implemented in order to achieve improved compression performance and better-quality reconstructed images. The number of features extracted by DTT is relatively high; it can be reduced to a manageable low-dimensional space by using PCA to eliminate the irrelevant features in the given image.

Principal Component Analysis
PCA is traditionally used as a dimension reduction technique and is a useful tool to visualize high-dimensional data in a manageable low-dimensional space [46, 47]. It uses an orthogonal transformation to convert a set of $p$ (possibly correlated) observed variables (i.e., features) into another set of $p$ uncorrelated features (i.e., PCs). The principal components are uncorrelated linear functions of the originally observed features that successively maximize variance: the first PC is the axis along which the observed data show the maximum variance; the second PC is the axis that is orthogonal to the first PC and along which the data show the second-largest variance; the third PC is the axis that is orthogonal to the first two PCs and along which the data show the third-largest variance, and so forth. Thus, orthogonal dimensions of data variability are captured in the $p$ PCs, and together the PCs account for the entire data variation. The key objective of PCA is to capture as much variation as possible in the first few PCs. It is, therefore, often the case that the first $k$ ($k \ll p$) PCs hold the potentially valuable information in the observed data, and the rest hold variation mostly due to noise [27].
More precisely, let $x_{ij}$ represent a real-valued observation of the $j$th feature made on the $i$th subject, where $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, p$. Assume that the $n$ observations are organized in the $n$ rows of an $n \times p$ data matrix $X$ whose columns correspond to the $p$ features. We normalize the columns of $X$ to have zero mean and unit standard deviation and save the resulting values in a data matrix $A$; that is, the elements $a_{ij}$ of $A$ are calculated by Equation (1).

$$a_{ij} = \frac{x_{ij} - \bar{x}_j}{s_j} \tag{1}$$

where $\bar{x}_j$ and $s_j$ are the mean and standard deviation of the $j$th column of $X$, respectively. PCA can be carried out using the SVD of $A$; that is, the $n \times p$ matrix $A$ of rank $r \le \min(n, p)$ is decomposed as given in Equation (2).

$$A = K \mathfrak{D} L^{T} \tag{2}$$

where $K$ is an $n \times r$ matrix with orthonormal columns ($K^{T}K = I_r$), $\mathfrak{D}$ is an $r \times r$ diagonal matrix comprising the non-negative singular values in descending order of magnitude on the diagonal, and $L$ is a $p \times r$ matrix with orthonormal columns ($L^{T}L = I_r$). The sample correlation matrix of $X$, denoted by $R$, can be expressed as follows

$$R \propto A^{T}A = L \Lambda L^{T} \tag{3}$$

where $\Lambda = \mathfrak{D}^{2}$ is an $r \times r$ diagonal matrix containing the non-zero positive eigenvalues $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_r)^{T}$ of the matrix $A^{T}A$ on the diagonal in descending order of magnitude. It follows that the columns of $L$ comprise the eigenvectors of $A^{T}A$ and hence are the desired directions of variation. The derived set of $r$ PCs is calculated by

$$P = AL \tag{4}$$

It is worth noting that the matrix $K$ comprises the normalized PCs in its columns and is a scaled version of $P$, which follows from Equation (2). To see this, multiply Equation (2) on the right by $L$ to obtain Equation (5).

$$AL = K \mathfrak{D} L^{T} L = K \mathfrak{D} = P \tag{5}$$
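As a rough illustration of the standardization in Equation (1) and of the eigenvector view of PCA, the following sketch standardizes a tiny two-feature dataset, forms its sample correlation matrix, and finds the leading direction of variation. It is only a sketch under stated assumptions: the sample standard deviation (divisor $n - 1$) is assumed, power iteration is swapped in for a full SVD to keep the example dependency-free, and all function names and the toy data are hypothetical.

```python
import math

def standardize(X):
    """Equation (1): centre each column and divide by its sample standard deviation."""
    n, p = len(X), len(X[0])
    A = [[0.0] * p for _ in range(n)]
    for j in range(p):
        column = [row[j] for row in X]
        mean = sum(column) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
        for i in range(n):
            A[i][j] = (X[i][j] - mean) / std
    return A

def correlation_matrix(A):
    """Sample correlation matrix A^T A / (n - 1) of standardized data."""
    n, p = len(A), len(A[0])
    return [[sum(A[i][a] * A[i][b] for i in range(n)) / (n - 1)
             for b in range(p)] for a in range(p)]

def leading_eigenpair(R, iterations=200):
    """Power iteration (assumed stand-in for SVD): largest eigenvalue and unit eigenvector."""
    p = len(R)
    v = [1.0] + [0.0] * (p - 1)   # simple start vector; not robust for every matrix
    for _ in range(iterations):
        w = [sum(R[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    Rv = [sum(R[a][b] * v[b] for b in range(p)) for a in range(p)]
    return sum(v[a] * Rv[a] for a in range(p)), v

# Toy data: 4 observations of 2 strongly correlated features.
X = [[2.0, 1.9], [1.0, 1.2], [3.0, 3.1], [4.0, 3.8]]
A = standardize(X)
R = correlation_matrix(A)
eigenvalue, direction = leading_eigenpair(R)
# Scores of the first PC: projection of each standardized row on the direction.
first_pc = [sum(a * d for a, d in zip(row, direction)) for row in A]
print(round(eigenvalue, 4), [round(d, 4) for d in direction])
```

For a 2 × 2 correlation matrix with positive correlation, the leading eigenvalue equals one plus the correlation coefficient, which gives a quick sanity check on the result.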
In practice, the first $k \ll p$ major PCs are of most interest since they represent the majority of the data variation. Without undue loss of information, the dimension of $A$ may thus be reduced from $p$ to $k$, that is

$$\bar{P} = A\bar{L} \tag{6}$$

where $\bar{L}$ is a $p \times k$ matrix that comprises the first $k$ columns of $L$ and $\bar{P}$ comprises the first $k$ PCs in its columns. The set of the first $k$ PCs is a lower-dimensional representation of a $p$-dimensional dataset and can be used to expose trends and patterns in the data, which is possibly the most common application of PCA. Additionally, the most useful PCs can be used in further data analyses. Although the $k$ PCs are useful because they represent most of the data variation, they are difficult to interpret in terms of the original features [48], and misleading or spurious interpretations frequently occur [27]. The noise level in datasets owing to high dimensionality often conceals valuable interpretable patterns [49]. Moreover, the PCs are generated in an unsupervised manner and so may not be optimal for further studies such as regression. In such cases, it is assumed that only a limited number of features can reveal reproducible patterns in the observed data, and feature selection (instead of extraction) techniques are generally preferred.
A lower-rank approximation of $A$ can be calculated by Equation (7).

$$\hat{A} = \bar{K}\, \bar{\mathfrak{D}}\, \bar{L}^{T} \tag{7}$$

where $\bar{K}$ and $\bar{L}$ comprise the first $k$ columns of $K$ and $L$, and $\bar{\mathfrak{D}}$ is the leading $k \times k$ block of $\mathfrak{D}$. This is the best approximation of $A$ in the least-squares sense by a matrix of rank $k$ [50, 51]. The value of $k$ is selected by means of existing heuristics, including a plot of the eigenvalues in descending order of magnitude, called a scree plot, or by choosing the smallest value of $k$ for which the cumulative proportion of explained variance exceeds a predefined threshold $\tau \in [0, 1]$. Generally, the value of $\tau$ ranges from 0.7 to 0.9 [27]. A value of $\tau = 0.85$ implies that the selected PCs describe at least 85% of the cumulative variance in the observed data. PCA executes the following steps to compress an input image:
1. Form the data vectors from the input image matrix.
2. Calculate the covariance matrix.
3. Calculate the eigenvalues and eigenvectors by solving the characteristic equation. Each eigenvector should be normalized.
4. Construct the transformation matrix from the normalized eigenvectors.
5. Derive the transform of the given image.
6. Reproduce the input values from the transformed coefficients.
7. Reduce the dimensionality of the input image matrix.
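The threshold rule for choosing the number of retained PCs, described before the step list, can be sketched as follows. The helper name `choose_num_components` and the eigenvalues are illustrative assumptions, not values from the experiments.

```python
def choose_num_components(eigenvalues, threshold=0.85):
    """Smallest k whose leading eigenvalues explain at least `threshold` of the variance."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return k
    return len(ordered)


# Hypothetical eigenvalues of an 8-feature dataset (they sum to 10).
eigenvalues = [4.0, 2.5, 1.5, 1.0, 0.6, 0.2, 0.1, 0.1]
k = choose_num_components(eigenvalues, threshold=0.85)
print(k)  # 4: the first four eigenvalues explain 90% of the variance
```

Lowering the threshold to 0.7 would keep three components for the same eigenvalues, which shows how the threshold trades compression against retained variance.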

Discrete Tchebichef transform
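The DTT algorithm has been widely used to enhance the reconstruction quality of conventional image compression methods. This work implements an effective lossy compression technique based on DTT to yield a better-quality reproduced picture for the expected CR. The DTT is a relatively new transform that uses the Tchebichef moments to derive a basis matrix. It is derived from the orthonormal Tchebichef polynomials [52]. For an image $f(x, y)$ of dimension $N \times N$, the forward DTT of order $r + s$ is defined as

$$T_{rs} = \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} t_r(x)\, t_s(y)\, f(x, y) \tag{8}$$

where $r, s = 0, 1, \ldots, N-1$. The inverse DTT is defined by

$$f(x, y) = \sum_{r=0}^{N-1} \sum_{s=0}^{N-1} T_{rs}\, t_r(x)\, t_s(y) \tag{9}$$

where $x, y = 0, 1, \ldots, N-1$. In (8) and (9), $t_r(x)$ and $t_s(y)$ are the $r$th- and $s$th-order Tchebichef polynomials, respectively. Generally, the $q$th-order Tchebichef polynomial is defined using the following recurrence relation

$$t_q(x) = (\alpha_1 x + \alpha_2)\, t_{q-1}(x) + \alpha_3\, t_{q-2}(x) \tag{10}$$

with

$$\alpha_1 = \frac{2}{q} \sqrt{\frac{4q^2 - 1}{N^2 - q^2}} \tag{11}$$

$$\alpha_2 = \frac{1 - N}{q} \sqrt{\frac{4q^2 - 1}{N^2 - q^2}} \tag{12}$$

$$\alpha_3 = \frac{1 - q}{q} \sqrt{\frac{2q + 1}{2q - 3}} \sqrt{\frac{N^2 - (q - 1)^2}{N^2 - q^2}} \tag{13}$$

The initial values of $t_q(x)$ for $q = 0, 1$ are defined as

$$t_0(x) = \frac{1}{\sqrt{N}} \tag{14}$$

$$t_1(x) = (2x + 1 - N) \sqrt{\frac{3}{N(N^2 - 1)}} \tag{15}$$

Equation (9) can be expressed using a series representation involving matrices as follows

$$f(x, y) = \sum_{r=0}^{N-1} \sum_{s=0}^{N-1} T_{rs}\, G_{rs}(x, y) \tag{16}$$

where $x, y = 0, 1, \ldots, N-1$, and $G_{rs}$ is known as the basis matrix. For an $8 \times 8$ block, the basis matrix $G_{rs}$ can be defined as follows

$$G_{rs} = \begin{bmatrix} t_r(0)t_s(0) & t_r(1)t_s(0) & \cdots & t_r(7)t_s(0) \\ t_r(0)t_s(1) & t_r(1)t_s(1) & \cdots & t_r(7)t_s(1) \\ \vdots & \vdots & \ddots & \vdots \\ t_r(0)t_s(7) & t_r(1)t_s(7) & \cdots & t_r(7)t_s(7) \end{bmatrix} \tag{17}$$

Hence, the DTT of a square image as in (8) can be viewed as the projection of the image $f$ on the basis image $G_{rs}$, which is the outer product of the vectors $t_r$ and $t_s$, where $t_r = [t_r(0)\; t_r(1)\; \ldots\; t_r(N-1)]$ and $t_s = [t_s(0)\; t_s(1)\; \ldots\; t_s(N-1)]$, so that $G_{rs}(x, y) = t_r(x)\, t_s(y)$. Equation (8) can be written as

$$T_{rs} = \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} G_{rs}(x, y)\, f(x, y) \tag{18}$$

In other words, the Tchebichef transform coefficient $T_{rs}$ estimates the correlation between the image $f$ and the basis image $G_{rs}$. It takes a high positive value if there is a strong similarity between them. As the order of the transform increases, the basis images change from low spatial frequency to high spatial frequency. Because the polynomials are orthonormal, there is neither a large variation in the dynamic range of the transformed values nor the numerical instability that otherwise occurs for large image dimensions $N$.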
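To make the transform concrete, the sketch below builds the orthonormal Tchebichef polynomials with the standard three-term recurrence and applies the forward and inverse 2D DTT to one 8 × 8 block. It is a minimal, unoptimized illustration: the function names are hypothetical, the block is stored as `block[x][y]`, and the test block is synthetic rather than taken from the experiments.

```python
import math

def tchebichef_basis(N):
    """Return t[q][x]: orthonormal Tchebichef polynomial of order q at x = 0..N-1."""
    t = [[0.0] * N for _ in range(N)]
    for x in range(N):
        t[0][x] = 1.0 / math.sqrt(N)
        if N > 1:
            t[1][x] = (2 * x + 1 - N) * math.sqrt(3.0 / (N * (N * N - 1)))
    for q in range(2, N):
        scale = math.sqrt((4.0 * q * q - 1) / (N * N - q * q))
        a1 = 2.0 / q * scale
        a2 = (1.0 - N) / q * scale
        a3 = ((1.0 - q) / q) * math.sqrt((2.0 * q + 1) / (2.0 * q - 3)) \
            * math.sqrt((N * N - (q - 1) ** 2) / (N * N - q * q))
        for x in range(N):
            t[q][x] = (a1 * x + a2) * t[q - 1][x] + a3 * t[q - 2][x]
    return t

def dtt2(block, t):
    """Forward 2D DTT: T[r][s] = sum_x sum_y t_r(x) t_s(y) f(x, y)."""
    N = len(block)
    return [[sum(t[r][x] * t[s][y] * block[x][y]
                 for x in range(N) for y in range(N))
             for s in range(N)] for r in range(N)]

def idtt2(coeffs, t):
    """Inverse 2D DTT: f(x, y) = sum_r sum_s T[r][s] t_r(x) t_s(y)."""
    N = len(coeffs)
    return [[sum(coeffs[r][s] * t[r][x] * t[s][y]
                 for r in range(N) for s in range(N))
             for y in range(N)] for x in range(N)]


N = 8
t = tchebichef_basis(N)
block = [[10 * x + y for y in range(N)] for x in range(N)]  # smooth synthetic block
coeffs = dtt2(block, t)
restored = idtt2(coeffs, t)
max_error = max(abs(restored[x][y] - block[x][y]) for x in range(N) for y in range(N))
print(round(coeffs[0][0], 6), max_error < 1e-9)
```

For this smooth block, nearly all the energy lands in the three lowest-order coefficients, which is the energy-compaction behavior that lossy coding exploits by discarding or coarsely quantizing the remaining coefficients.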

Hybrid color image compression approach
In the proposed approach, a hybrid technique is developed that integrates the PCA and DTT techniques to compress color images. The block diagram of the proposed method is depicted in Figure 1.
1. First, the dimension of the input picture is reduced by PCA, which concentrates the majority of the image data in the first few PCs. Dimensionality reduction is carried out using these first few PCs of the given picture, and the picture can subsequently be reproduced from the first few coefficients. PCA is considered an ideal compression technique in terms of its energy compaction. Even though PCA provides a better compression ratio, the other quality parameters of the reconstructed image, including the PSNR, SSIM, and UQI values, are poor. Accordingly, these quality metrics are improved using the DTT algorithm.
2. In the second phase, DTT is used for image compression, which provides improved PSNR, SSIM, and UQI values compared to PCA. Accordingly, the quality of the reproduced picture obtained from PCA can be further enhanced using the DTT method. In this phase, DTT is applied to the output image obtained from PCA to generate transformed coefficients. Subsequently, the compression is performed on the transformed coefficient values.

Results and Discussion
In this section, we demonstrate the efficiency of our compression method using the results obtained from the experiments. The PCA-DTT technique is implemented and evaluated using MATLAB 2017a running on a system with a Core i7 processor, 32 GB RAM, and the Windows 10 operating system. Different images, including animals, human faces, historical images, objects, and vegetables, in different formats (bmp, png, tiff, and jpg) and different sizes (512×512, 256×256, and 120×80) are used as inputs. For all images, we use a block size of 8×8. The quality of the reproduced images is assessed by calculating the PSNR, SSIM, and UQI values. The performance of the PCA-DTT system is evaluated with respect to CR and CT.

Compression Performance Assessment
The performance of image compression methods can be evaluated in many respects. We assess the compression performance of our method using CR and CT. The compression ratio is defined as the ratio of the total number of bits needed to store the input picture to the total number of bits needed to store the compressed picture.
The compression time is the time taken by the proposed technique to compress an image. This parameter reflects the computational complexity and the speed of the compression process.
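Both measures are straightforward to compute, as the sketch below suggests. Note the assumptions: `zlib` is used only as a stand-in compressor so that the example runs on its own (it is lossless and is not the PCA-DTT method), and the helper names `compression_ratio` and `timed` are hypothetical.

```python
import time
import zlib

def compression_ratio(original_bits, compressed_bits):
    """CR = bits needed for the input image / bits needed for the compressed image."""
    if compressed_bits <= 0:
        raise ValueError("compressed size must be positive")
    return original_bits / compressed_bits

def timed(func, *args):
    """Run func(*args) and return (result, elapsed seconds) as a simple CT measurement."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


# Synthetic 256 x 256 8-bit image with constant rows (highly redundant).
raw = bytes(r % 256 for r in range(256) for _ in range(256))
compressed, elapsed = timed(zlib.compress, raw)   # stand-in compressor only
cr = compression_ratio(8 * len(raw), 8 * len(compressed))
print(f"CR = {cr:.1f}:1, CT = {elapsed:.4f} s")
```

In practice the timing would be averaged over several runs, since a single wall-clock measurement is sensitive to system load.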

Image Quality Assessment
In order to quantify the visually observed differences between the input and the reconstructed images, many quality metrics have been considered [53]. In lossy compression, the difference between the compressed image and the original image is called distortion. Other terms such as quality and fidelity are also used to denote the variation between the input and the compressed image. The most widely used measure for this purpose is PSNR. It is used to quantify the errors perceptible to human vision. This ratio is usually expressed on the logarithmic decibel (dB) scale. It is defined as the ratio between the peak signal power and the power of the corrupting noise that affects the fidelity of its representation. A higher PSNR value indicates better quality of the reproduced image. The PSNR is calculated as follows

$$\mathrm{PSNR} = 10 \log_{10}\left(\frac{\mathrm{MAX}^2}{\mathrm{MSE}}\right)$$

where MSE is the mean squared error between the original and the reconstructed images and MAX is the maximum possible pixel value (255 for 8-bit images).
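A minimal sketch of the PSNR computation is given below. It assumes 8-bit grayscale images stored as lists of rows (a color image would be handled per channel or on its luminance), and the function names are illustrative.

```python
import math

def mse(original, reconstructed):
    """Mean squared error between two equally sized images (lists of rows)."""
    total, count = 0.0, 0
    for row_o, row_r in zip(original, reconstructed):
        for a, b in zip(row_o, row_r):
            total += (a - b) ** 2
            count += 1
    return total / count

def psnr(original, reconstructed, peak=255.0):
    """PSNR in dB; returns infinity when the two images are identical."""
    error = mse(original, reconstructed)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / error)


original = [[100, 110], [120, 130]]
reconstructed = [[101, 109], [121, 129]]   # every pixel is off by one, so MSE = 1
print(round(psnr(original, reconstructed), 2))  # 48.13 dB
```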
The value of SSIM determines the resemblance between two pictures and can be considered as the quality of one of the pictures when the other one is taken as the reference [54]. The basic principle of SSIM is that structural distortion can be related to perceived picture quality, and it is measured as:

$$\mathrm{SSIM}(o, r) = \frac{(2\mu_o\mu_r + C_1)(2\sigma_{or} + C_2)}{(\mu_o^2 + \mu_r^2 + C_1)(\sigma_o^2 + \sigma_r^2 + C_2)}$$

where $o$ and $r$ represent the original and the reconstructed pictures, and $\mu_o$ and $\mu_r$ are the mean values of the luminance in the original and reconstructed pictures, respectively. The parameters $\sigma_o$ and $\sigma_r$ are the corresponding standard deviations of the luminance, and $\sigma_{or}$ is the cross-covariance. The factors $C_1$ and $C_2$ are small constants that stabilize the ratio when the denominators are close to zero. The value of SSIM can range from -1 to +1; SSIM = +1 is obtained only when the reconstructed image is identical to the original image. SSIM does not consider the attributes of the human visual system but has nevertheless been shown to correlate strongly with subjective picture quality assessments. Estimating SSIM also creates a map that gives the value of picture quality over space, making it possible to compare various regions in the pictures and to perceive their resemblances.
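The SSIM expression can be evaluated over a single window as in the sketch below. Several choices here are assumptions rather than details given in the text: the whole pixel list is treated as one window (no sliding window or SSIM map), the constants follow the commonly used defaults $C_1 = (0.01 \cdot 255)^2$ and $C_2 = (0.03 \cdot 255)^2$, and the name `ssim_single_window` is hypothetical.

```python
def ssim_single_window(o, r, peak=255.0, k1=0.01, k2=0.03):
    """SSIM of two equally long pixel lists, computed over one window only."""
    n = len(o)
    mu_o = sum(o) / n
    mu_r = sum(r) / n
    var_o = sum((a - mu_o) ** 2 for a in o) / (n - 1)
    var_r = sum((b - mu_r) ** 2 for b in r) / (n - 1)
    cov = sum((a - mu_o) * (b - mu_r) for a, b in zip(o, r)) / (n - 1)
    c1 = (k1 * peak) ** 2   # assumed default stabilizing constants
    c2 = (k2 * peak) ** 2
    numerator = (2 * mu_o * mu_r + c1) * (2 * cov + c2)
    denominator = (mu_o ** 2 + mu_r ** 2 + c1) * (var_o + var_r + c2)
    return numerator / denominator


original = [50, 100, 150, 200]
reconstructed = [52, 98, 149, 205]
score = ssim_single_window(original, reconstructed)
print(round(score, 4))
```

A full implementation would slide a window over the image and average the local scores, which also yields the spatial quality map mentioned above.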
Another important quality measure for evaluating a compression algorithm is the UQI [55]. It is calculated by modeling the picture distortion in terms of luminance distortion, contrast distortion, and loss of correlation. The value of UQI for each sub-image can be estimated as:

$$\mathrm{UQI} = \frac{4\,\sigma_{or}\,\mu_o\,\mu_r}{(\sigma_o^2 + \sigma_r^2)(\mu_o^2 + \mu_r^2)}$$

where $\mu_o$ and $\mu_r$ are the mean values of the original and the reconstructed pictures. The parameters $\sigma_o$ and $\sigma_r$ are the standard deviations of the original and reconstructed pictures, and $\sigma_{or}$ is the covariance.
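Since the UQI is estimated per sub-image, one plausible way to obtain an image-level score is to average it over blocks, as sketched below. The averaging step, the handling of flat blocks (where the denominator vanishes), and the function names are assumptions made for illustration.

```python
def uqi_block(o, r):
    """UQI of one sub-image given as two equally long pixel lists."""
    n = len(o)
    mu_o = sum(o) / n
    mu_r = sum(r) / n
    var_o = sum((a - mu_o) ** 2 for a in o) / (n - 1)
    var_r = sum((b - mu_r) ** 2 for b in r) / (n - 1)
    cov = sum((a - mu_o) * (b - mu_r) for a, b in zip(o, r)) / (n - 1)
    denominator = (var_o + var_r) * (mu_o ** 2 + mu_r ** 2)
    if denominator == 0:
        # Assumed convention for flat blocks: perfect score only if identical.
        return 1.0 if list(o) == list(r) else 0.0
    return 4 * cov * mu_o * mu_r / denominator

def mean_uqi(blocks_o, blocks_r):
    """Average the per-block UQI over all sub-images (assumed aggregation)."""
    scores = [uqi_block(o, r) for o, r in zip(blocks_o, blocks_r)]
    return sum(scores) / len(scores)


block = [10, 20, 30, 40]
brighter = [20, 30, 40, 50]          # same structure, shifted mean
print(round(uqi_block(block, block), 4))      # 1.0
print(round(uqi_block(block, brighter), 4))   # 0.9459
```

The second result shows that a pure luminance shift lowers the index even though the structure and contrast are unchanged.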

Implementation Of DTT
The compression efficiency obtained from PCA alone is not satisfactory. Therefore, the quality of the compressed picture obtained from PCA is further enhanced by the DTT technique, which provides a superior PSNR value. DTT is applied to the picture reproduced by PCA in order to obtain the transformed coefficients. Subsequently, the compression is performed on the transformed coefficient values.

Implementation Of Hybrid Color Image Compression Technique
The PCA-DTT method is evaluated with different pictures, as shown in Figure 4. The reproduced output pictures are displayed in Figure 5. The results obtained with our PCA-DTT technique in terms of the performance metrics are displayed in Figures 6 to 10. The graphs show that the proposed method achieves improved performance as well as improved quality metrics. More precisely, the PCA-DTT method provides a compression ratio superior to that of DTT-SVD: the average CR obtained by the PCA-DTT method and DTT-SVD is 8.20 and 7.30, respectively. The quality of the output images obtained from the PCA-DTT technique has also been evaluated and compared in terms of PSNR under the same setup. The proposed method outperforms the DTT-SVD technique, attaining a higher average PSNR (33.43 dB) than DTT-SVD (32.99 dB). From a comprehensive study of the output images, it can be concluded that the proposed method also provides better results than DTT-SVD with respect to both SSIM and UQI. The average SSIM achieved by the PCA-DTT method and DTT-SVD is 0.97 and 0.91, respectively, and the average UQI is 0.84 and 0.68, respectively.

Conclusion
The increasing use of digital imaging in various scientific and technological domains makes it more difficult to process huge volumes of multimedia products. Processing, transmitting, and storing digital images in their original form is very costly. Hence, data compression has become an inevitable part of image processing for reducing the transmission time, communication bandwidth, and storage space requirements of images. This paper presents a lossy compression method that integrates the PCA and DTT algorithms. The approach employs the PCA technique to reduce the dimensionality of the image and the DTT algorithm to enhance the quality of the reconstructed image. The proposed approach is implemented using MATLAB, and its robustness is verified against the DTT-SVD image compression algorithm using CR, CT, PSNR, SSIM, and UQI. The proposed method overcomes the image representation restrictions faced by prior compression methods. The experimental results reveal that our PCA-DTT method outperforms the existing DTT-SVD method for different types of image content, achieving a high CR with lower computational complexity while retaining the quality of the input digital image.

Figure 1: Overview of the proposed hybrid image compression method

Our PCA-DTT technique is implemented in two phases.

Figure 2: Input and output images of the PCA algorithm

Figure 3 shows the scree plot (i.e., the variances of the extracted PCs as a function of the component index [27]). In this plot, the eigenvalues are sorted from largest to smallest. This has significant implications for selecting a subset of PCs that captures the representative behavior of the image data. The graph levels off after PC5, indicating that components 6-8 account for comparatively little additional variance. Hence, five PCs are considered.

Figure 3: Scree plot of variance in the PCA algorithm

Figure 3: Input and output images of the DTT algorithm

Figure 4: Input images used to evaluate the PCA-DTT algorithm

In order to illustrate the effectiveness of the PCA-DTT technique over the DTT-SVD method, an extensive comparison in terms of different metrics, namely CR, CT, PSNR, SSIM, and UQI, is carried out, and the results are given in Table 1. From a comprehensive study of the results observed, it can be stated that our

Figure 5: Output images obtained from PCA-DTT algorithm

Figure 7: Performance measures for human face image

Additionally, to demonstrate the superiority of the PCA-DTT technique in terms of computational complexity, the compression time is considered. The average time taken for image compression by the PCA-DTT technique and DTT-SVD is 3.92 s and 4.41 s, respectively. Hence, it is concluded that the computational complexity of PCA-DTT is lower than that of DTT-SVD.

Figure 6: Performance measures for Animal Images

Figure 8: Performance measures for historical image

Figure 9: Performance measures for object image


Table 1: Comparison of performance measures between DTT-SVD and PCA-DTT