Symbiotic View of Provenance in Cyber Infrastructure and Information Security

Access control is an important element in providing confidentiality for secured data. Access specifiers define the degree of rights given to users for utilizing data records appropriately. Tampering with records by unauthorized parties is a major concern in secure communication, and tamper detection plays an important role in troubleshooting network and host intrusion scenarios. Advances in computer technology have driven the contemporary world to rely on the World Wide Web for digital data and associated information. People across the globe depend on the internet for everything from local information to the distribution of personal data through heterogeneous networks. Technology and software tools have grown to the extent that almost all financial transactions take place through online portals. At the same time, security threats to users' sensitive data, which online users share while performing financial transactions on e-commerce portals, have grown sharply. Various authentication techniques are available to maintain security over untrusted networks. These security procedures are considered robust and adequate in their contexts, yet over time intruders find ways to break into systems. Data theft and intrusion into information systems would increase daily if defensive measures were not in place. We integrate the concepts of secret sharing and data provenance to provide a home-grown solution for the parameters of information security, namely confidentiality, integrity and availability.
Keywords: Data Provenance, Secret Sharing, Information Security, Access Control.


Introduction
When data is transmitted from one point to another, or from one point to multiple points, the scenario can be viewed as data communication. The sender, receiver, protocol, communication channel and data packets are the components of communication. In this context, the data or message is transmitted through a communication channel in encrypted form to secure it. At the receiver end, the data or message is decrypted and processed further. Fig. 1 demonstrates the concept of communication channels in networks. One is a high-risk channel with few safety features and the other is a low-risk channel with good security features [1]. Along with the communication channels and their risk factors, Fig. 1 also depicts the source and destination. The low-risk communication station is denoted X1 Y1 Z1 and the high-risk communication station X2 Y2 Z2. The naming convention of the channels reflects the fundamental security parameters, namely confidentiality, integrity and availability. With the availability of good wireless internet access in mobile settings, the number of customers and the volume of their data, including media, have become massive. For example, many financial operations carried out by users over online platforms were found to be insecure and unauthenticated. Procedures with appropriate algorithms are available for safe data communication in various modes; however, they fall short of achieving high accuracy and performance with respect to the fundamental objectives of information security (the CIA triad). Security is the main concern in any communication over untrusted networks today [2]. Many researchers have made substantial contributions to effective security algorithms despite the various threats that exploit vulnerabilities in computer systems.
The source of the information, i.e., the entity by which an online operation was created, is the pertinent question to be answered when a transaction is finalized. This notion of 'data antiquity' has received considerable interest from investigators in different fields for many decades and is often termed data provenance [3][4][5]. Security in provenance has also made some progress in recent research, particularly in the field of cyber security. The following sections explain the aspects of data provenance and visual encryption, followed by a literature analysis with results and discussion.

Literature related to Data Provenance and Visual Encryption
The description and representation of the genesis, lineage and pedigree of an object is referred to as provenance. When applied to a data object and its associations, it is characterized as data provenance. Provenance data is delicate, and a slight discrepancy leads to alteration of the complete sequence. Provenance therefore needs to be safeguarded, and access should be granted only to authorized parties. Data provenance provides a record of lineage with regard to the transactions, events, processes and systems that influence the data of concern. This lineage record provides a better understanding of data dependencies and associated relationships. Multiple-entity access control is demonstrated using a secret sharing mechanism, and secret sharing is realized through a visual encryption process. These security mechanisms are deployed to safeguard sensitive genesis data, called provenance data. The proposed work provides background on safeguarding provenance through the secret sharing security concept.

A. Literature aspects of Data Provenance
Data provenance is the record or collection of events and transitions that a data item has experienced in its life cycle. Provenance information is often called genesis data and is sensitive. A small modification made to genesis data changes the entire lineage with respect to its variables and functions. A detailed literature review on data provenance and security is presented in this subsection.
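The sensitivity of lineage to small modifications can be illustrated with a hash chain, in which each provenance entry is hashed together with the hash of its predecessor. This is only a minimal sketch under stated assumptions: the hash chain itself, the event fields (`op`, `by`) and the function names are hypothetical and are not taken from the cited literature or from the proposed system.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed starting value for the first link


def link_hash(prev_hash, event):
    """Hash one provenance event together with the previous link's hash."""
    payload = json.dumps(event, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("utf-8") + payload).hexdigest()


def build_chain(events):
    """Return a list of {'event', 'hash'} entries, each bound to its predecessor."""
    chain = []
    prev = GENESIS_HASH
    for event in events:
        prev = link_hash(prev, event)
        chain.append({"event": dict(event), "hash": prev})
    return chain


def first_tampered_index(chain):
    """Return the index of the first entry whose stored hash no longer matches,
    or None if the whole lineage verifies."""
    prev = GENESIS_HASH
    for i, entry in enumerate(chain):
        if link_hash(prev, entry["event"]) != entry["hash"]:
            return i
        prev = entry["hash"]
    return None
```

Because every hash depends on all earlier entries, an attacker who alters one event would have to recompute every later hash to hide the change, which is one way to read the statement that a small modification affects the entire lineage.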
Visual cryptography is a significant development in information security research, allowing encryption with very little mathematical computation. The concept of shares [6] is at the heart of its design. In a few instances, data transmitted in this form was also targeted by attackers. Public key encryption is applied to the generated image shares to make the method robust [7]. Over time, the importance and relevance of video streaming data decreases, and this decline is directly proportional to the reliability of the multimedia data. An image-based scheme built on the Burrows-Wheeler encoding algorithm [8], together with a conditional transposition technique, was proposed for short-period visual security, taking into account the graphical encoding of large-scale data and the unification of security. With the public exchanging data on the internet, analyzing the creation and sources of genuine data has become a vital aspect of recent research. Protection is the term that is catching the interest of common internet users. In untrusted contexts, consumers are more concerned about the security of the information they exchange [9]. A structured arrangement for the authentication of data provenance has been established, along with a basic paradigm that describes provenance and correlates it with its essential consequences for the security fundamentals, namely integrity, confidentiality and availability.
McDaniel [10] described the enormous amounts of information that organizations and individual users are provided with, from both internal processes and far-flung, untrusted, unfamiliar sources. Data provenance and information security are therefore handled in a similar manner. This information has to be vetted for any associated weaknesses. Provenance verification is also an important activity that must accompany data security vulnerability tests. Fig. 2 illustrates the interaction between provenance, information security priorities and related fields of operation. With the arrival of the World Wide Web (WWW) and its rapidly growing popularity across the globe, people have become accustomed to accessing data over the internet. Data streaming from image capture and audiovisual processing, particularly in sensor systems, yields large amounts of information. As far as attribution is concerned in this field, secure communication of provenance data is challenging. An innovative method for data transmission [11] embeds provenance into the inter-packet scheduling domain in sensor networks. The data receiver extracts the provenance using an optimized threshold-based process, which reduces the probability of provenance decoding errors.
The "Chain-structure" provenance system [12] explicitly addresses the protection features of data in a hierarchical setting, providing three-dimensional metadata provenance. Cloud storage is one of the latest technology developments, and its infrastructure and retrieval features have evolved considerably. With cloud computing and virtualization-based applications, organizations are expanding activities at various sites without worrying about their network expenditure. For the customers who pay for those services, protection is an evolving problem. A novel technique for tracing the entire lineage of data provenance along the data history is available [13]. It uses rule-based data provenance algorithms to track customer data for cloud-based leakage risks.

B. Model of Provenance activities
A basic understanding of provenance, and its simplest definition, can be found in the W7 model [14]. Any information entity can be analyzed with the seven Ws, viz. Who, What, Why, Where, When, Which and How. With this model, reasoning about issues and troubleshooting them becomes easier. A semantic repository is created and stored for all data records with the available parameters. The aforementioned model is illustrated in Fig. 3. We focus on securing the provenance with a distinct security mechanism, secret sharing.
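As an illustration, a single provenance entry in the W7 style could be represented as a record with one field per question. The sketch below is an assumption made for clarity: the class name `W7Record`, the use of plain strings for every field and the helper `missing_dimensions` are hypothetical and are not prescribed by [14].

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class W7Record:
    """One provenance entry, with a field for each of the seven Ws."""
    who: str    # entity that acted on the data
    what: str   # event or operation that occurred
    why: str    # reason given for the operation
    where: str  # location or system where it happened
    when: str   # time of the event (ISO 8601 string assumed here)
    which: str  # instrument or software used
    how: str    # action or procedure leading to the event


def missing_dimensions(record):
    """List the W7 questions that the record leaves unanswered."""
    return [name for name, value in asdict(record).items() if not value.strip()]
```

A repository of such records can then be queried along any of the seven dimensions, and `missing_dimensions` gives a quick completeness check before a record is stored.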

C. Literature aspects of Visual Cryptography
Visual encryption is a cryptographic system that uses images to conceal the input message. In this technique, deciphering requires no computation. The input is represented as a group of black and white pixels and is broken down into two parts, called shares. Every share contains a series of black and white sub pixels. Each share is printed separately with its pixels arranged accordingly: one is the printed ciphertext page and the other is a printed transparency. To decode the message, the transparency is overlaid on the ciphertext page in the correct alignment, which reveals the original message.
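The two-share construction described above can be sketched as a basic (2, 2) scheme in which every secret pixel is expanded into a cell of 4 sub pixels. This is a minimal illustration under stated assumptions, not the implementation used in this work: the pattern table, the flattened cell layout (top-left, top-right, bottom-left, bottom-right) and the function names are hypothetical.

```python
import secrets

# Each cell is a flattened 2x2 block with exactly two black (1) sub pixels.
PATTERNS = [
    (1, 1, 0, 0), (0, 0, 1, 1),  # horizontal
    (1, 0, 1, 0), (0, 1, 0, 1),  # vertical
    (1, 0, 0, 1), (0, 1, 1, 0),  # diagonal
]


def make_shares(secret):
    """Split a binary image (rows of 0 = white, 1 = black) into two shares."""
    share1, share2 = [], []
    for row in secret:
        row1, row2 = [], []
        for pixel in row:
            pattern = secrets.choice(PATTERNS)
            complement = tuple(1 - bit for bit in pattern)
            row1.append(pattern)
            # White pixel: identical cells. Black pixel: complementary cells.
            row2.append(complement if pixel == 1 else pattern)
        share1.append(row1)
        share2.append(row2)
    return share1, share2


def stack(share1, share2):
    """Overlay two transparencies: a sub pixel is black if either one is black."""
    return [
        [tuple(a | b for a, b in zip(cell1, cell2)) for cell1, cell2 in zip(row1, row2)]
        for row1, row2 in zip(share1, share2)
    ]


def decode(stacked):
    """Read a cell as black only when all 4 stacked sub pixels are black."""
    return [[1 if sum(cell) == 4 else 0 for cell in row] for row in stacked]
```

Each share on its own shows two black sub pixels in every cell regardless of the secret, so a single share reveals nothing. Stacking yields fully black cells for black pixels and half-black cells for white pixels, which the eye perceives as contrast without any computation; `decode` is included only to check the result programmatically.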

Fig. 3 Category of shares in black and white pixels: horizontal shares, vertical shares and diagonal shares.
Fig. 3 illustrates an overview of the black and white pixel division into 4 sub pixels. The first category is horizontal, the second is vertical and the third is diagonal.

D. Application aspects of secret shares
To improve image quality after applying visual cryptography, the following approaches were suggested: XOR-based visual cryptography for a general access structure (GAS) and adaptive section implementation using exclusive OR. Conventional visual cryptography (VC) procedures use OR for image reconstruction. These two techniques use exclusive-OR-based procedures in VC to deliver good visual quality for the reconstructed image and security for the transparencies of the secret image [15]. Lin et al. focus on implementing secret sharing by combining various twofold encryption techniques and visual cryptography methods [16]. Their work emphasizes hiding information in images. First, two shares are constructed from the pixels of the original data using a distribution matrix. Then a data encryption rule is used to translate these two matrices into an image, which is converted into a cover key. According to the authors, this technique is simple to use, is safer and does not require an input image for the decryption process. The work in [20] deals with the simple visual secret sharing scheme that uses the traditional visual cryptography method with 4 subpixel classes in a (2, 2) setting. The concept of CAPTCHA is also used, because only the human visual system (HVS), and not a machine, can decode the stacked image. Three types of images are considered. Type I has twisted characters on a black and white background. Type II consists of warped and vivid characters. In Type III the background was removed, which helped achieve an accuracy rate of 0.96. These outcomes were obtained from the comparison. Nitty et al. described the process of error diffusion in halftone visual cryptography [21]. Node error diffusion substitutes a pixel with a reference node in classical error diffusion so that coarse quantization errors are minimized.
In a pixel row, appropriate proportions of the quantization error at each pixel are diffused to chosen pixels in the neighboring rows. A halftone VC image share is divided into non-overlapping halftone cells of size q = v1 x v2. In each share, a secret pixel of the image is encoded into one halftone cell. Kumar et al. proposed a scheme that aims to improve existing methods of visual cryptography [22].
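The exclusive OR reconstruction mentioned earlier in this subsection can be illustrated with a plain byte-wise (n, n) XOR sharing scheme. This is a simplified sketch and not the general access structure construction of [15]: the function names are hypothetical, and the scheme assumes that all n shares are required to recover the secret.

```python
import secrets
from functools import reduce


def xor_bytes(left, right):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(a ^ b for a, b in zip(left, right))


def split_secret(secret, n):
    """Split `secret` into n shares; all n are needed to rebuild it."""
    if n < 2:
        raise ValueError("at least two shares are required")
    # n - 1 shares are uniformly random; the last one absorbs the secret.
    shares = [secrets.token_bytes(len(secret)) for _ in range(n - 1)]
    last = reduce(xor_bytes, shares, secret)
    return shares + [last]


def combine(shares):
    """XOR all shares together to recover the secret exactly."""
    return reduce(xor_bytes, shares)
```

Unlike OR-based stacking, where white regions remain partly black after reconstruction, XOR cancels the random masks completely, so the recovered secret is bit-for-bit identical to the original. With n = 3 this matches a setting in which three separate authorities must each contribute a share.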

Results and Discussion
At the core of the suggested approach is the preservation of the origin, or genesis, of the database. Fig. 5 demonstrates this with three shares held by the primary authority, the secondary authority and the third authority. A valid-share checking process is in place before access is granted to the party concerned [23]. The shares for access control are produced by the visual encryption mechanism. The keyword for retrieving the vital file is obtained only from a particular combination of shares. This keyword is provided to the user for entry into the proposed provenance-based application. Timestamp data is collected for each user's entry into the system during a particular period, and the corresponding customer Id is also captured. This captured information is useful in cyber-attack investigations, troubleshooting, risk management, etc. [24][25]. Fig. 6 illustrates the 3-D scatter plot representation of OTP, Operations and Timestamp. The operations referred to here are as follows.
• Delete
• Alter
• Update
• Add
One-time password (OTP) values across the timestamps of user logins are also depicted in the graph. Fig. 9 illustrates the scatter variation of the time parameter across customer Id, with reference to the reason given for accessing the provenance data application. The recorded results help in identifying who accessed the system, which helps in troubleshooting cyber-attack scenarios. Arguably, the tracking of this lineage corresponds to the definition of data provenance in the digital world. The graphical illustrations were generated with the SPSS statistical tool [26].
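The access information described in this section (customer Id, operation, OTP and timestamp) can be captured as a simple audit record. The sketch below is only an assumption about how such logging might look: the names `AccessEvent`, `record_access` and `events_for_customer` are hypothetical, and the actual storage format of the proposed application is not specified in this paper.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# The four operations listed above.
ALLOWED_OPERATIONS = {"Delete", "Alter", "Update", "Add"}


@dataclass(frozen=True)
class AccessEvent:
    """One logged entry into the provenance application."""
    customer_id: str
    operation: str
    otp: str
    timestamp: datetime


def record_access(log, customer_id, operation, otp, timestamp=None):
    """Validate the operation and append an access event to the log."""
    if operation not in ALLOWED_OPERATIONS:
        raise ValueError(f"unsupported operation: {operation}")
    event = AccessEvent(customer_id, operation, otp,
                        timestamp or datetime.now(timezone.utc))
    log.append(event)
    return event


def events_for_customer(log, customer_id):
    """Return one customer's events in chronological order."""
    return sorted((e for e in log if e.customer_id == customer_id),
                  key=lambda e: e.timestamp)
```

Filtering and ordering the log by customer in this way is the kind of query an investigator would run when tracing who accessed the system and when.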

Conclusion
Experiments with the suggested approach indicate, through simulation, improved outcomes for the CIA parameters when provenance is preserved with secret sharing. The approach relies on secret sharing for access to the genesis of a specific archive. Confidentiality is preserved by requiring multiple entities to permit access, which, unlike other cryptographic mechanisms, is an exclusive aspect of secret sharing. This work provides an interpretation of secret-sharing-based data protection for provenance. The literature for both approaches is described comprehensively. Provenance has several applications in various fields. Computer information technology is one such area, closely related to provenance through various software implementations and data repositories. Provenance data requires appropriate security and relevant access control mechanisms. In this line, an effort is made to unite provenance with a typical security mechanism, using the secret sharing concept to control access to genesis data.
Both principles, visual encryption and data provenance, are demonstrated through specific cases. The literature review illustrates the need to coordinate and connect these two verticals. Adequate formulations are given for the mathematical model of the secret sharing mechanism. Appropriate findings are provided in developing the problem statement with the concepts of the corresponding application.