You're reading from  Apache Oozie Essentials

Product type: Book
Published in: Dec 2015
Reading level: Intermediate
ISBN-13: 9781785880384
Edition: 1st Edition
Author (1)
Jagat Singh

Configuring Oozie in Hortonworks distribution


In this section, we will learn how to configure Oozie inside the Hortonworks Hadoop distribution using Ambari. We will configure the Oozie server to store all job information in a MySQL database instead of the default Derby database.

We will use a virtual machine to learn how to configure Oozie in the Hortonworks Hadoop distribution. Most other distributions, such as Cloudera and Pivotal, have similar steps.

Let's start with the following steps:

  1. If you don't have VirtualBox on your machine, then download and install VirtualBox from https://www.virtualbox.org/wiki/Downloads.

  2. Download the Hortonworks single node virtual machine from http://hortonworks.com/hdp/downloads/. The download will take 1-2 hours depending on your Internet connection speed.

    Tip

    It is always good to store virtual machine images in a common folder. For example, I keep mine in a folder such as ~/dev/vm/. This makes virtual machine image management easier.

  3. After the download is complete, open VirtualBox and click on File | Import Appliance:

    Import appliance

  4. Click on the Import Appliance button, browse to the place where you downloaded the virtual machine image, and then click on Continue.

  5. Wait until VirtualBox finishes importing the new machine.

  6. Once you can see that the machine is imported, click on Start machine in the virtual machine console.

  7. Once the machine has finished booting, you can log in to the Ambari dashboard by opening the URL http://127.0.0.1:8080 in your browser.

  8. Use admin as both the username and the password.

    It will take some time for all services to start up and report their status to Ambari. Once all the services have reported their status, have a glance at the Ambari console. It is also a good idea to stop the services that we are not using to reduce the load on the system.

  9. In the Ambari dashboard, click on the link named Oozie on the left side. You can see that there are two components for Oozie: Oozie Server and Oozie Client. Since we are using a single node cluster, both the server and the client are installed on the same machine. In a production environment, you will configure the Oozie server and clients separately on different machines. Using the client, we will submit jobs to the server. Before submitting a job, we tell the client where the server is located using the OOZIE_URL variable.

    Tip

    To avoid manually specifying the Oozie server on the client machine every time, you can set the environment variable OOZIE_URL in your bash_profile or environment file, depending on the operating system you use. For example, add export OOZIE_URL=http://oozieserver:11000/oozie; in this book, oozieserver will be localhost.

  10. Now click on the Config link at the top to configure MySQL as the database. The Oozie server will use MySQL to store the job information:

    Ambari Oozie configuration

  11. You may notice that, at this moment, the server is configured to use a Derby database. Derby is good for experimenting and testing, but not for running a production server. We will configure Oozie to use a MySQL database instead.

  12. Log in to the virtual machine using SSH as follows:

    $ ssh root@127.0.0.1 -p 2222
    

    The default password is hadoop.

  13. After you log in to the SSH session, log in to MySQL:

    $ mysql -u root
    
  14. Since this is a test virtual machine, no password is configured for the MySQL root user. In production, the account would be password protected.

  15. At the MySQL prompt, execute the following SQL statements:

    CREATE USER 'oozie'@'%' IDENTIFIED BY 'hadoop';
    CREATE DATABASE oozie;
    GRANT ALL PRIVILEGES ON oozie.* TO 'oozie'@'%' WITH GRANT OPTION;
    

    The following output will be generated:

    Oozie database creation

  16. To make Oozie work with MySQL, we need the JDBC driver for it. Download the MySQL JDBC driver from the MySQL JDBC jar download section and extract the archive to a folder such as /root/mysql inside the virtual machine:

    $ cd ~/
    $ mkdir mysql
    $ cd mysql
    $ # Download the MySQL JDBC Driver
    $ wget http://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.36.tar.gz
    $ # Extract tar
    $ tar -xvf mysql-connector-java-5.1.36.tar.gz
    $ # Tell Ambari about the new MySQL JDBC driver that it can use
    $ ambari-server setup --jdbc-db=mysql --jdbc-driver=/root/mysql/mysql-connector-java-5.1.36/mysql-connector-java-5.1.36-bin.jar
    
  17. In the Ambari dashboard, configure the MySQL database with the following details:

    Field name           Value
    -----------------    ---------------------
    Database Name        oozie
    Database Username    oozie
    Database Password    hadoop
    JDBC Driver Class    com.mysql.jdbc.Driver
    JDBC Database URL    jdbc:mysql://localhost:3306/${oozie.db.schema.name}?createDatabaseIfNotExist=true

  18. In the Ambari dashboard page, click on Test Connection. If all is good, a green tick should appear. We have now configured the Oozie server to use the MySQL database instead of Derby.

  19. Finally, to confirm that Oozie works properly, in another browser tab open the Oozie dashboard by entering the URL http://127.0.0.1:11000/oozie.
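The browser check in step 19 has a command-line counterpart that also exercises the OOZIE_URL variable from step 9. The following is a minimal sketch, assuming the Oozie client is on the PATH inside the virtual machine and the server listens on port 11000 as in the URL above. The helper name oozie_url is hypothetical and only used here for illustration; it is not part of Oozie.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the Oozie server URL from a host and a port.
oozie_url() {
  local host="${1:-localhost}"
  local port="${2:-11000}"
  printf 'http://%s:%s/oozie\n' "$host" "$port"
}

# Export the URL so the client does not need the -oozie option on every call.
export OOZIE_URL="$(oozie_url localhost 11000)"
echo "OOZIE_URL=$OOZIE_URL"

# Ask the server for its status, but only when the client is installed.
if command -v oozie >/dev/null 2>&1; then
  oozie admin -status || echo "Oozie server not reachable at $OOZIE_URL"
else
  echo "oozie client not found; skipping status check"
fi
```

To make the setting permanent, the export line can be added to your bash_profile, as described in the tip in step 9.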

This completes the first section, in which we learned how to configure Oozie in the Hortonworks distribution using Ambari.
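If Test Connection fails, the database settings from steps 15 to 17 can be checked by hand from the SSH session. The following is a rough sketch, assuming that Ambari replaces the ${oozie.db.schema.name} placeholder with the Database Name value, and that the mysql client is available. The helper jdbc_url is hypothetical and only shows what the resolved URL would look like under that assumption.

```shell
#!/usr/bin/env bash
# Values entered in step 17 (test credentials from this walkthrough).
DB_NAME="oozie"
DB_USER="oozie"
DB_PASS="hadoop"

# Hypothetical helper: print the JDBC URL with the schema name filled in.
jdbc_url() {
  local host="$1" port="$2" db="$3"
  printf 'jdbc:mysql://%s:%s/%s?createDatabaseIfNotExist=true\n' "$host" "$port" "$db"
}

echo "Resolved JDBC URL: $(jdbc_url localhost 3306 "$DB_NAME")"

# Try to log in as the oozie user created in step 15 and list its databases.
if command -v mysql >/dev/null 2>&1; then
  mysql -u "$DB_USER" -p"$DB_PASS" -e 'SHOW DATABASES;' \
    || echo "could not log in to MySQL as $DB_USER"
else
  echo "mysql client not found; skipping login check"
fi
```

If the login succeeds and the oozie database is listed, the user and grants from step 15 are in place, and any remaining problem is more likely in the driver or the URL.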
