Apache Oozie Essentials


Product type: Book
Published: Dec 2015
ISBN-13: 9781785880384
Pages: 164
Edition: 1st Edition
Author: Jagat Singh

Running MapReduce jobs from Oozie


In this section, we write a simple word count MapReduce job and schedule it via Oozie. Later, we will wrap it in our first Coordinator job. Along the way, we will pick up a few concepts and apply them in examples.

I have already written a word count MapReduce job in Java, which we will run over our input data. You can find the code in the mapreduce folder under Book_Code_Folder/learn_oozie/ch04/.

Note

Check the workflow_0.5.xsd file in the xsd_svg folder and note the inputs needed for the MapReduce action to run.

The Workflow is shown in the following code. Its arguments are the same as the ones we would pass to the hadoop jar command when running a MapReduce job. At the start of the job, we delete the output folder, because Hadoop fails a job whose output folder already exists.
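
As a rough illustration of the shape described above, here is a minimal sketch of a Workflow with a single MapReduce action. It is not the book's own listing. The mapper and reducer class names are the ones this chapter uses; everything else is an assumption made for the example: the workflow and action names, the parameters ${jobTracker}, ${nameNode}, ${inputDir} and ${outputDir}, the output key and value types, and the choice of the new MapReduce API (which is why the two new-api flags are set).

```xml
<!-- Sketch only: names and parameters below are hypothetical unless noted. -->
<workflow-app xmlns="uri:oozie:workflow:0.5" name="wordcount-wf">
    <start to="mr-wordcount"/>

    <action name="mr-wordcount">
        <map-reduce>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>

            <!-- Delete the output folder first; Hadoop fails the job if it already exists. -->
            <prepare>
                <delete path="${nameNode}${outputDir}"/>
            </prepare>

            <configuration>
                <!-- Assumption: the mapper and reducer use the new (org.apache.hadoop.mapreduce) API. -->
                <property>
                    <name>mapred.mapper.new-api</name>
                    <value>true</value>
                </property>
                <property>
                    <name>mapred.reducer.new-api</name>
                    <value>true</value>
                </property>

                <!-- Class names as given in this chapter. -->
                <property>
                    <name>mapreduce.job.map.class</name>
                    <value>life.jugnu.learnoozie.ch04.WordCountMapper</value>
                </property>
                <property>
                    <name>mapreduce.job.reduce.class</name>
                    <value>life.jugnu.learnoozie.ch04.WordCountReducer</value>
                </property>

                <!-- Assumption: a typical word count emits (Text, IntWritable) pairs. -->
                <property>
                    <name>mapreduce.job.output.key.class</name>
                    <value>org.apache.hadoop.io.Text</value>
                </property>
                <property>
                    <name>mapreduce.job.output.value.class</name>
                    <value>org.apache.hadoop.io.IntWritable</value>
                </property>

                <!-- Input and output locations, the same values a hadoop jar invocation would take. -->
                <property>
                    <name>mapreduce.input.fileinputformat.inputdir</name>
                    <value>${inputDir}</value>
                </property>
                <property>
                    <name>mapreduce.output.fileoutputformat.outputdir</name>
                    <value>${outputDir}</value>
                </property>
            </configuration>
        </map-reduce>
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <kill name="fail">
        <message>MapReduce failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

In a sketch like this, the parameter values would typically be supplied through the job properties file at submission time, and the jar containing the mapper and reducer would sit in the lib folder next to the workflow definition on HDFS.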

The mapper that we need is life.jugnu.learnoozie.ch04.WordCountMapper and the reducer is life.jugnu.learnoozie.ch04.WordCountReducer. Both of them are present...
