
You're reading from  Apache Oozie Essentials

Product type: Book
Published in: Dec 2015
Reading level: Intermediate
ISBN-13: 9781785880384
Edition: 1st Edition
Author: Jagat Singh


Chapter 9. Running Oozie in Production

In this chapter, we will see how to deploy Oozie code in production using the best practices of continuous integration and deployment. We will also see how to make Oozie work in a secured Hadoop cluster. Finally, we will discuss how to rerun Oozie jobs that have failed partway through.

In this chapter, we will:

  • Create production-ready code for Oozie

From the concept point of view, we will:

  • Understand the concept of rerun

Packaging and continuous delivery


In this section, we will see how to package the Oozie code and deploy it in production.

The code for this section is available in the folder <BOOK_CODE_HOME>/learn_oozie/ch09/packaging.

Import the project into your favorite editor (Eclipse/IntelliJ) as a Maven project.

The source code of Oozie gets deployed at two places:

  • On HDFS, where we copy all the Workflows, Coordinators, and so on.

  • On the local client machine from where we submit the jobs using the command line. All the job.properties files reside here.

If you look at the code folder, you will find a simple Maven project with the following folder structure:

Maven project structure

The code that goes to HDFS lives in the hdfs folder, and the code that has to be on the local client machine lives in the client folder. Each of them contains a folder called apps, which holds the individual apps representing Oozie Workflows. I have copied one of the applications...
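The two deployment targets described above can be sketched as a small dry-run script. This is only an illustration, not the book's build: the app name and both target directories are hypothetical, and the script assumes it is run from the project root so that the hdfs/apps and client/apps folders are reachable. It prints the commands instead of running them, so nothing touches the cluster until the output is reviewed.

```shell
# Sketch: print the commands that deploy one Oozie app to its two targets.
# APP_NAME, HDFS_APP_ROOT and CLIENT_APP_ROOT are hypothetical defaults.
APP_NAME="${APP_NAME:-my_app}"
HDFS_APP_ROOT="${HDFS_APP_ROOT:-/user/oozie/apps}"
CLIENT_APP_ROOT="${CLIENT_APP_ROOT:-${HOME:-/tmp}/oozie/apps}"

deploy_commands() {
  local app="$1"
  # Workflows, Coordinators, and so on go to HDFS.
  echo "hdfs dfs -mkdir -p ${HDFS_APP_ROOT}"
  # Remove any previous copy so the upload starts from a clean directory.
  echo "hdfs dfs -rm -r -f ${HDFS_APP_ROOT}/${app}"
  echo "hdfs dfs -put hdfs/apps/${app} ${HDFS_APP_ROOT}/"
  # job.properties files stay on the local client machine.
  echo "mkdir -p ${CLIENT_APP_ROOT}"
  echo "cp -r client/apps/${app} ${CLIENT_APP_ROOT}/"
}

# Dry run: show what would be executed for the default app.
deploy_commands "$APP_NAME"
```

Once the printed commands look right, they can be piped to a shell (for example, `deploy_commands my_app | sh`) from a continuous delivery job.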

Oozie in secured cluster


A secured Hadoop cluster needs some additional configuration for Oozie to work properly. Standard actions such as Pig or MapReduce do not need any additional configuration on the Oozie side. However, when Oozie needs to talk to external services such as HBase, HCatalog, and HiveServer2, we need to tell it how to authenticate with them.

This is done by supplying credentials. Oozie ships with credential implementations for external tools such as Hive, HBase, and HCatalog.

In oozie-site.xml, we need to add the following property:

<property>
  <name>oozie.credentials.credentialclasses</name>
  <value>
    hcat=org.apache.oozie.action.hadoop.HCatCredentials,
    hbase=org.apache.oozie.action.hadoop.HbaseCredentials,
    hive2=org.apache.oozie.action.hadoop.Hive2Credentials
  </value>
</property>

In workflow.xml, we need to state that we want to use the declared credentials and...
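As an independent illustration (not the book's own listing), the sketch below prints a hypothetical workflow.xml that declares a credential of type hive2 and attaches it to an action through the cred attribute. The workflow name, credential name, host, principal, and script name are all placeholders. The hive2.jdbc.url and hive2.server.principal property names are the ones I understand the Hive2 credential class to expect; check them against the Oozie version in use.

```shell
# Sketch: emit a hypothetical workflow.xml that uses a declared credential.
# Every name, URL, and principal below is a placeholder, not a real value.
workflow_fragment() {
  cat <<'EOF'
<workflow-app xmlns="uri:oozie:workflow:0.5" name="secure-hive2-wf">
  <credentials>
    <!-- type must match a key listed in oozie.credentials.credentialclasses -->
    <credential name="hive2_cred" type="hive2">
      <property>
        <name>hive2.jdbc.url</name>
        <value>jdbc:hive2://hiveserver.example.com:10000/default</value>
      </property>
      <property>
        <name>hive2.server.principal</name>
        <value>hive/_HOST@EXAMPLE.COM</value>
      </property>
    </credential>
  </credentials>
  <start to="hive2-node"/>
  <!-- cred names the credential this action should use -->
  <action name="hive2-node" cred="hive2_cred">
    <hive2 xmlns="uri:oozie:hive2-action:0.1">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <jdbc-url>jdbc:hive2://hiveserver.example.com:10000/default</jdbc-url>
      <script>query.hql</script>
    </hive2>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail"><message>Hive2 action failed</message></kill>
  <end name="end"/>
</workflow-app>
EOF
}

# Print the fragment; redirect to a file to inspect or adapt it.
workflow_fragment
```

The heredoc delimiter is quoted so that the shell leaves ${jobTracker} and ${nameNode} untouched for Oozie to resolve from job.properties.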

Rerun


Life is not perfect! We face failures every day, and the same is true of Oozie running in production. Jobs fail, and we need to rerun them.

Oozie can restart jobs from an intermediate state to save time:

  • To rerun a Coordinator, we specify the action that failed or the date for which we need to rerun

  • To rerun a Bundle, we specify the Coordinator that failed
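The two cases in the list above can be sketched as helper functions that build the corresponding oozie job -rerun command lines. This is a hedged sketch: the job IDs, the Coordinator name, and the date range are hypothetical, and the functions only print the command rather than run it. They assume the Oozie server URL is supplied separately, for example through the OOZIE_URL environment variable.

```shell
# Sketch: build (but do not execute) rerun commands for Coordinators and Bundles.

# Coordinator: rerun by action number(s) or by a date range.
# Usage: coord_rerun_cmd <coordinator-id> action|date <value>
coord_rerun_cmd() {
  local coord_id="$1" kind="$2" value="$3"
  case "$kind" in
    action|date) echo "oozie job -rerun ${coord_id} -${kind} ${value}" ;;
    *) echo "kind must be 'action' or 'date'" >&2; return 1 ;;
  esac
}

# Bundle: rerun the named Coordinator(s) inside the Bundle.
# Usage: bundle_rerun_cmd <bundle-id> <coordinator-name[,name...]>
bundle_rerun_cmd() {
  local bundle_id="$1" coords="$2"
  echo "oozie job -rerun ${bundle_id} -coordinator ${coords}"
}

# Hypothetical IDs and values, for illustration only.
coord_rerun_cmd 0000001-150921003038748-oozie-oozi-C action 1,3-4
coord_rerun_cmd 0000001-150921003038748-oozie-oozi-C date 2015-09-21T00:00Z::2015-09-22T00:00Z
bundle_rerun_cmd 0000002-150921003038748-oozie-oozi-B my_coordinator
```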

Rerun Workflow

To rerun a Workflow that has failed, we have two configuration properties:

  • oozie.wf.rerun.skip.nodes

  • oozie.wf.rerun.failnodes

oozie.wf.rerun.skip.nodes is the list of nodes to skip, while oozie.wf.rerun.failnodes is a Boolean value that tells if Oozie should run only the failed nodes.

Here's an example of Workflow rerun:

oozie job -rerun 0000003-150921003038748-oozie-oozi-W -Doozie.wf.rerun.failnodes=true

In the preceding example, we passed the ID of the Workflow to rerun and asked Oozie to run only the failed nodes.
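The other property, oozie.wf.rerun.skip.nodes, can be passed in the same way. The sketch below wraps both variants in one helper that prints the command; the node names in the usage line are hypothetical, and the Oozie server URL is again assumed to come from the environment.

```shell
# Sketch: build (but do not execute) a Workflow rerun command.
# Usage: wf_rerun_cmd <workflow-id> failed
#        wf_rerun_cmd <workflow-id> skip <node1,node2,...>
wf_rerun_cmd() {
  local wf_id="$1" mode="$2" nodes="${3:-}"
  case "$mode" in
    failed)
      # Rerun only the nodes that failed.
      echo "oozie job -rerun ${wf_id} -Doozie.wf.rerun.failnodes=true" ;;
    skip)
      # Rerun everything except the listed nodes.
      [ -n "$nodes" ] || { echo "skip mode needs a node list" >&2; return 1; }
      echo "oozie job -rerun ${wf_id} -Doozie.wf.rerun.skip.nodes=${nodes}" ;;
    *)
      echo "unknown mode: ${mode}" >&2; return 1 ;;
  esac
}

# Hypothetical node names, for illustration only.
wf_rerun_cmd 0000003-150921003038748-oozie-oozi-W skip import-node,clean-node
```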

Rerun Coordinator

To rerun a Coordinator that has failed, we need to specify the actions to rerun or specify the...

Summary


In this chapter, we saw how to package and deploy the Oozie code in production. Then we discussed how to configure Oozie code to run in a secured cluster. You also learned about the concept of rerun. I am sure you will be a pro with Oozie.
