Implementation of on-process aggregation for efficient big data processing in Hadoop MapReduce environment | IEEE Conference Publication | IEEE Xplore

Abstract:

The term Big Data refers to very large data sets whose volume, variability, and velocity make them difficult to manage, process, or analyze. Hadoop is used to analyze data of this kind, but processing is very time-consuming. One way to address this problem and to reduce replication time is to execute the job partially, so that an approximate, early result becomes available to the user before the job completes. The proposed system offers a newer MapReduce architecture that allows data to be divided for easier and earlier processing. This saves time and also improves system utilization for batch jobs. The proposed system presents a newer version of the Hadoop MapReduce framework that supports on-process aggregation, which allows users to get early results of a job while it is still computing. The technique will be evaluated using real-world datasets and applications, with the aim of improving the system's performance in terms of precision and time. The combiner introduced in this system is a local reducer: it executes after the map function and before the reducer. Instead of processing the complete file, on-process aggregation divides the file into a number of blocks, which lets it deliver the result in slots. Dividing the file into a number of data sets helps deliver results as early as possible by giving intermediate results to the user. The objective of the proposed technique is to improve the performance of Hadoop MapReduce for efficient and easy processing of very large data.
Date of Conference: 26-27 August 2016
Date Added to IEEE Xplore: 26 January 2017
Conference Location: Coimbatore, India

I. Introduction

MapReduce On-process is a modified version of Hadoop MapReduce, an open-source implementation of the MapReduce programming model. It supports on-process aggregation and stream processing, while also improving utilization and decreasing response time. Existing MapReduce implementations materialize the intermediate results of mappers. This approach has the advantage of simple recovery when failures occur; however, reducers cannot start executing tasks before all mappers have finished. The main motivation of MapReduce On-process is to overcome this limitation by introducing a combiner while preserving fault-tolerance guarantees. Although MapReduce was originally designed as a batch-oriented system, it is often used for data analysis: a user submits a job to extract information from a data set and then waits to view the results before moving to the next step of the analysis process.
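The idea of combining locally and reporting early results can be illustrated with a minimal single-process sketch. This is not the paper's implementation and does not use the Hadoop API; it only simulates the flow in plain Python, assuming a word-count job chosen for illustration. All function names (`map_words`, `combine`, `split_into_blocks`, `on_process_aggregate`) and the `block_size` parameter are hypothetical.

```python
from __future__ import annotations

from collections import Counter
from typing import Iterable, Iterator


def map_words(line: str) -> Iterator[tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in the line."""
    for word in line.lower().split():
        yield word, 1


def combine(pairs: Iterable[tuple[str, int]]) -> Counter:
    """Combiner (local reducer): sum the counts emitted for one block."""
    local = Counter()
    for word, count in pairs:
        local[word] += count
    return local


def split_into_blocks(lines: list[str], block_size: int) -> Iterator[list[str]]:
    """Divide the input into blocks of at most block_size lines."""
    for start in range(0, len(lines), block_size):
        yield lines[start:start + block_size]


def on_process_aggregate(
    lines: list[str], block_size: int
) -> Iterator[tuple[float, dict[str, int]]]:
    """Yield (fraction_processed, running_counts) after each block.

    Every snapshot except the last is an approximate, early result.
    """
    running = Counter()
    processed = 0
    for block in split_into_blocks(lines, block_size):
        mapped = (pair for line in block for pair in map_words(line))
        running.update(combine(mapped))  # reduce step: merge the block's local totals
        processed += len(block)
        yield processed / len(lines), dict(running)


if __name__ == "__main__":
    # Hypothetical sample input, used only to show the early snapshots.
    sample = ["big data hadoop", "hadoop mapreduce", "big data", "mapreduce combiner"]
    for fraction, counts in on_process_aggregate(sample, block_size=2):
        print(f"{fraction:.0%} processed: {counts}")
```

Because the generator yields after every block, a caller can show the user a partial count as soon as the first block is combined, rather than waiting for the whole input. In a real Hadoop job the combiner would run on each mapper's node and the blocks would be input splits; the sketch collapses all of that into one process.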

