Real-Time Analytics with Hadoop: Integrating Streaming Engines for Performance Gains

Harsha Vardhan Reddy Goli

Abstract

The rising demand for real-time data analytics in domains such as the Internet of Things (IoT) and telecommunications calls for hybrid big data architectures that combine batch and stream processing. This study investigates the integration of Hadoop with two real-time streaming engines, Apache Storm and Apache Flink, to address the challenges of low-latency analytics within traditional big data frameworks. We analyze the performance tradeoffs, latency mitigation techniques, and fault tolerance mechanisms involved in such hybrid deployments. Through benchmarking and architectural evaluation, the research identifies key design considerations, including pipeline optimization and resource management strategies that support concurrent batch and real-time workloads. Empirical insights from IoT and telecom use cases illustrate the effectiveness of integrating Hadoop’s scalable storage with the high-throughput, low-latency processing capabilities of modern stream engines. The findings support the practicality and performance benefits of adopting a unified analytics ecosystem for real-time, data-driven decision-making.

Article Details

How to Cite
Vardhan Reddy Goli, H. (2020). Real-Time Analytics with Hadoop: Integrating Streaming Engines for Performance Gains. Turkish Journal of Computer and Mathematics Education (TURCOMAT), 11(2), 1347–1358. https://doi.org/10.61841/turcomat.v11i2.15250
