Advantages of MapReduce

Abstract: I2MapReduce: Fine-Grain Incremental Processing in Big Data Mining is an extension of the MapReduce technique. As new data and updates arrive, the results of data mining applications become stale and obsolete; incremental processing refreshes those results. I2MapReduce has three advantages: (i) it performs incremental processing at the key-value pair level instead of re-computation at the task level; (ii) it supports not only one-step computation but also more sophisticated iterative computation, which is widely used in data mining applications; and (iii) it incorporates a set of novel techniques to reduce the I/O overhead of accessing preserved fine-grain computation states.
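The idea of key-value pair level incremental processing can be illustrated with a small Python sketch. This is not the I2MapReduce implementation: the class `IncrementalWordCount`, its methods, and the in-memory dictionaries standing in for the preserved computation state are all hypothetical names chosen for illustration. When one input record changes, only that record is re-mapped and only the affected keys are updated.

```python
from collections import Counter


def map_fn(doc_id, text):
    """Map one input record to its intermediate (word, count) pairs."""
    return Counter(text.lower().split())


class IncrementalWordCount:
    """Hypothetical sketch: keeps preserved state so an update touches only changed keys."""

    def __init__(self):
        self.per_doc = {}        # preserved map output for each input record
        self.totals = Counter()  # preserved reduce result for each key

    def upsert(self, doc_id, text):
        """Insert or update one record; return the keys whose result changed."""
        new = map_fn(doc_id, text)
        old = self.per_doc.get(doc_id, Counter())
        changed = []
        for word in set(old) | set(new):
            delta = new[word] - old[word]  # a missing key counts as 0
            if delta == 0:
                continue  # this key's result is still fresh
            self.totals[word] += delta
            if self.totals[word] == 0:
                del self.totals[word]
            changed.append(word)
        self.per_doc[doc_id] = new
        return sorted(changed)

    def delete(self, doc_id):
        """Remove one record by treating it as an update to empty text."""
        changed = self.upsert(doc_id, "")
        self.per_doc.pop(doc_id, None)
        return changed


job = IncrementalWordCount()
job.upsert("d1", "a b a")
job.upsert("d2", "b c")
changed_keys = job.upsert("d1", "a c")  # only d1 is re-mapped
```

A task-level approach would instead re-run the whole job over every document whenever one document changes.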
While MapReduce is used in many areas where massive data analysis is required, there are still debates on its performance, efficiency per node, and simple abstraction. This survey intends to assist the database and open source communities in understanding various technical aspects of the MapReduce framework. In this survey, we characterize the MapReduce framework and discuss its inherent pros and cons.

MapReduce: Simplified Data Processing on Large Clusters
MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key. Many real-world tasks are expressible in this model.
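The map and reduce functions described above can be sketched for the classic word-count task. This is a minimal single-process illustration, not a distributed implementation; the helper `run_mapreduce` and the function names are assumptions made for this example, and the grouping step stands in for the shuffle that a real framework performs across machines.

```python
from collections import defaultdict


def map_word_count(key, value):
    """Map: take a (document name, text) pair and emit (word, 1) pairs."""
    for word in value.lower().split():
        yield word, 1


def reduce_word_count(key, values):
    """Reduce: merge all intermediate values that share one intermediate key."""
    return sum(values)


def run_mapreduce(records, map_fn, reduce_fn):
    """Hypothetical driver: map every record, group by key, then reduce each group."""
    groups = defaultdict(list)
    for key, value in records:
        for inter_key, inter_value in map_fn(key, value):
            groups[inter_key].append(inter_value)
    return {k: reduce_fn(k, vs) for k, vs in groups.items()}


documents = [
    ("doc1", "the quick brown fox"),
    ("doc2", "the lazy dog and the fox"),
]
counts = run_mapreduce(documents, map_word_count, reduce_word_count)
```

The user writes only the two small functions; the driver owns grouping and scheduling, which is what allows a framework to distribute the work.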

A Model of Computation for
The technical challenges in meeting the increasing demand to handle vast quantities of data are daunting and on the rise. MapReduce is a recent processing model that offers an efficient and intuitive way to process large amounts of data rapidly in parallel. It is a framework that defines a template approach to programming for large-scale data computation on clusters of machines in a cloud computing environment. MapReduce automatically parallelizes and distributes computation across several processors, hiding the complexity of writing parallel and distributed code. This paper provides a comprehensive systematic review and analysis of large-scale dataset processing and dataset handling challenges and requirements in a cloud computing environment by using the MapReduce framework and its open-source implementation.
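As a rough illustration of how a framework can hide parallelization from the programmer, the sketch below splits the input, processes each split in a separate worker, and merges the partial results. It is an assumption-laden stand-in: threads on one machine play the role of cluster nodes, and `parallel_word_count` and `count_words` are hypothetical names, not part of any MapReduce implementation.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(split):
    """Work done independently on one input split."""
    counts = Counter()
    for line in split:
        counts.update(line.lower().split())
    return counts


def parallel_word_count(lines, num_workers=4):
    """Hypothetical driver: partition the input, run workers, merge partial results."""
    # Round-robin partitioning; some splits may be empty for small inputs.
    splits = [lines[i::num_workers] for i in range(num_workers)]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(count_words, splits))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


lines = ["map reduce map", "reduce shuffle", "map shuffle sort"]
result = parallel_word_count(lines, num_workers=2)
```

The caller supplies only the per-split function and the data; partitioning, scheduling, and merging stay inside the driver.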
