Apache Oozie Essentials


Product type: Book
Published: Dec 2015
ISBN-13: 9781785880384
Pages: 164
Edition: 1st Edition
Author: Jagat Singh

Pig action


Let's see the Pig script that will help us calculate the maximum rainfall in each month.

I have saved the input data for this chapter in the input folder placed at BOOK_CODE_HOME/learn_oozie/ch05.

If you copied this folder to HDFS at the start of the chapter, the input data is already in the right place inside HDFS. If not, copy it to HDFS now.

The input data is comma separated and the columns in the data are as follows:

  • Product code

  • Bureau of Meteorology station number

  • Year, Month, Day

  • Rainfall amount (millimeters)

  • Period over which rainfall was measured (days)

  • Quality

We will write a Pig script that loads the raw input data and groups it by year and month. It then calculates the maximum rainfall for each month.
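Before looking at the Pig script, it may help to see the same group-and-maximum logic in plain Python. This is only an illustrative sketch, not the book's script. It assumes the "Year, Month, Day" entry is stored as three separate columns, giving eight columns in total, and that a missing reading appears as an empty field. The sample rows and the function name max_rainfall_by_month are made up for the example.

```python
import csv
from io import StringIO

# Made-up sample rows for illustration only; not taken from the book's data.
# Assumed column order: product code, station number, year, month, day,
# rainfall (mm), measurement period (days), quality.
SAMPLE = """\
PC001,000001,2014,01,01,0.0,1,N
PC001,000001,2014,01,02,12.4,1,N
PC001,000001,2014,02,01,3.2,1,N
PC001,000001,2014,02,02,,,
PC001,000001,2015,01,01,7.8,1,Y
"""


def max_rainfall_by_month(lines):
    """Return {(year, month): max rainfall in mm}, skipping rows with no reading."""
    maxima = {}
    for row in csv.reader(lines):
        year, month, rainfall = row[2], row[3], row[5]
        if rainfall == "":
            # Assumption: an empty field means no measurement was recorded.
            continue
        key = (int(year), int(month))  # group key: year and month
        value = float(rainfall)
        if key not in maxima or value > maxima[key]:
            maxima[key] = value
    return maxima


result = max_rainfall_by_month(StringIO(SAMPLE))
for (year, month), value in sorted(result.items()):
    print(year, month, value)
```

The key is the (year, month) pair, so January 2014 and January 2015 are kept as separate groups, which matches grouping by year and month rather than by month alone.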

The following Pig script is present at the path BOOK_CODE_HOME/learn_oozie/ch05/rainfall/pig:

-- Pig script to find the maximum rainfall in a given month
A = load '${pig_input}' using PigStorage(',') as (product_code:chararray,station_number...