Amazon EMR (Elastic MapReduce) is based on Hadoop, a Java-based framework for processing large data sets in a distributed computing environment. MapReduce, the programming model that Hadoop implements, lets developers write programs that process massive amounts of unstructured data in parallel across a cluster of processors or stand-alone computers.
EC2 (Elastic Compute Cloud) and S3 (Simple Storage Service) will also be used in this lab.
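To make the MapReduce model concrete, here is a minimal word-count sketch that runs on a single machine in plain Python. It is not EMR or Hadoop code; the function names (map_phase, shuffle, reduce_phase, word_count) are illustrative assumptions chosen for this example. It only mimics the three stages a MapReduce job goes through: map each input record to key/value pairs, group the pairs by key, and reduce each group to a result.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all values by key (done by the framework on a real cluster)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse all counts for one word into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input; on a cluster each line could be handled by a different node.
    sample = ["the quick brown fox", "the lazy dog", "The fox"]
    print(word_count(sample))
```

On a real cluster the map and reduce functions run on many nodes at once, and the framework handles the grouping step and the movement of data between machines; the logic the developer writes stays about this small.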
Elastic MapReduce:
Getting Started Tutorial: