Chapter 5. Running Pig Jobs
In this chapter, we will see how to run Pig jobs from Oozie. Pig is a general-purpose data flow language that makes it easy to write and run ETL jobs on Hadoop. If you are new to Pig, I suggest you check out the tutorial on the Pig website (http://pig.apache.org/docs).
In this chapter, we will:
Create Oozie Workflows for Pig actions
Run Pig jobs from Coordinators
From a conceptual point of view, we will:
Understand the concept of parameterization of Dataset instances
Understand the concept of Coordinator controls
Understand the concept of config-default.xml