In this chapter, we will learn how to run MapReduce jobs using Oozie. MapReduce jobs are of two types: Java MapReduce jobs and Streaming jobs. Streaming jobs let the mapper and reducer be written in languages other than Java. We will also turn to the "when" part of Workflow execution, using Coordinators to schedule our jobs.
In this chapter, we will do the following:
Run Java MapReduce jobs from Oozie
Run Streaming jobs from Oozie
Run Coordinator jobs
On the conceptual side, we will:
Understand the concept of Coordinators
Understand the concept of cron-based frequency schedules
Understand the importance of time zones in Oozie
Understand the concept of Datasets