What is MapReduce? How it Works - Hadoop MapReduce Tutorial

danielrodrigues
Created by danielrodrigues Jun 29, 2020

What is MapReduce? How it Works - Hadoop MapReduce Tutorial

What is MapReduce?
MAPREDUCE is a software framework and programming model used for processing huge amounts of big data. MapReduce program work in two phases, namely, Map and Reduce. Map tasks deal with splitting and mapping of data while Reduce tasks shuffle and reduce the data.
Hadoop is capable of running MapReduce programs written in various languages: Java, Ruby, Python, and C++. MapReduce programs are parallel in nature, thus are very useful for performing large-scale data analysis using multiple machines in the cluster.
The input to each phase is key-value pairs. In addition, every programmer needs to specify two functions: map function and reduce function.
How MapReduce Works? Complete Process
The whole process goes through four phases of execution namely, splitting, mapping, shuffling, and reducing.
Readmore: Hadoop Pig Tutorial: What is, Architecture, Example