Knowledge Graphs Using Cloud Services

Pruthvi Raj Venkatesh, et al.


Industries typically store many types of data in both structured and unstructured formats. This data is vast and valuable because it has been collected over many years, is spread across multiple data sources and instances, and is generated by expensive business processes. Although some of it is old, it remains relevant because it provides valuable leads for ongoing studies. A common problem in any industry is identifying and extracting knowledge from these varied data sources, because (1) they are geographically spread, (2) the data is extracted from diverse systems and is categorized as structured or unstructured, (3) knowledge of what data is present is incomplete, and (4) retrieval time and cost are high. Even though relational database management systems (RDBMS) store structured data and document management systems such as SharePoint organize and search unstructured data, there is a need for an efficient system that links relational databases with the knowledge present in unstructured data to produce a single knowledge repository, presented in the form of a knowledge graph (KG). The knowledge graph should also be supplemented with additional functionality such as search, knowledge extraction, storage, and maintenance. This paper proposes a novel cloud-based approach that indexes structured and unstructured data and generates a single knowledge graph from it. We provide details of implementation approaches for two popular cloud providers, Azure and AWS. We have implemented the approach on a dataset provided by the Bureau of Safety and Environmental Enforcement (BSEE) [4], which belongs to the oil and gas domain. The concepts detailed in the paper can be implemented using other cloud providers and extended to other industries.
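
The core idea of linking relational records with knowledge found in unstructured text can be illustrated with a small, self-contained sketch. This is not the paper's Azure or AWS implementation: it uses only the Python standard library, and the sample rows, document names, well identifiers, regular expression, and function names (`build_graph`, `neighbours`) are all hypothetical and invented for illustration. Entity extraction is reduced here to a simple pattern match, where a real pipeline would likely use a cloud indexing or text-analytics service.

```python
import re

# Hypothetical structured rows, standing in for a relational table.
WELLS = [
    {"well_id": "W-001", "operator": "Acme Energy", "field": "Field A"},
    {"well_id": "W-002", "operator": "Acme Energy", "field": "Field B"},
]

# Hypothetical unstructured documents, standing in for scanned reports.
DOCUMENTS = {
    "report_1.txt": "Inspection of W-001 found corrosion on the riser.",
    "report_2.txt": "W-001 and W-002 were shut in during the storm.",
}

# Assumed identifier format for this toy example only.
WELL_ID_PATTERN = re.compile(r"\bW-\d{3}\b")


def build_graph(rows, documents):
    """Return a set of (subject, predicate, object) triples."""
    triples = set()

    # Structured side: each column value becomes an edge from the well node.
    for row in rows:
        well = row["well_id"]
        triples.add((well, "operated_by", row["operator"]))
        triples.add((well, "located_in", row["field"]))

    # Unstructured side: link a document to every known well it mentions.
    known_wells = {row["well_id"] for row in rows}
    for name, text in documents.items():
        for well in set(WELL_ID_PATTERN.findall(text)):
            if well in known_wells:
                triples.add((name, "mentions", well))

    return triples


def neighbours(triples, node):
    """Return, sorted, every triple in which `node` is subject or object."""
    return sorted(t for t in triples if node in (t[0], t[2]))


graph = build_graph(WELLS, DOCUMENTS)
```

Calling `neighbours(graph, "W-002")` returns both the relational facts about that well and the documents that mention it, which is the kind of single-repository lookup the abstract describes. Storing the graph as plain triples keeps the sketch independent of any particular graph database.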
