A review of research on MapReduce scheduling algorithms in Hadoop | IEEE Conference Publication | IEEE Xplore


Abstract:

Big data has created an "era of tera" in which bulk volumes of data are collected at escalating rates. Owing to increases in storage capacity, processing power, and data availability, the volume of global data is now measured in zettabytes. Hadoop is one of the technologies in the big data landscape; it analyzes data through the Hadoop Distributed File System and MapReduce. Job scheduling is important for the efficient management of cluster resources. Hadoop schedulers are pluggable components that assign resources to jobs. Among the available schedulers, the most prominent are the default FIFO scheduler, the Fair scheduler, and the Capacity scheduler. This paper presents a comprehensive survey of job scheduling algorithms, along with a comparative parametric analysis that emphasizes the key points these schedulers have in common.
Date of Conference: 15-16 May 2015
Date Added to IEEE Xplore: 06 July 2015
Conference Location: Greater Noida, India

I. Introduction

Big data [1] refers to massive collections of data whose processing depends on open-source frameworks such as Hadoop and MapReduce. Such data cannot be processed with traditional data-processing tools such as relational databases and Structured Query Language (SQL). Specifically, big data covers the creation, storage, retrieval, and analysis of data characterized by five V's: volume, velocity, variety, veracity, and value. According to one report [2], Facebook processes more than 500 TB of data daily. Many similar reports on big data statistics [3] shed light on the challenges of big data.

