Apache Oozie Essentials

Unleash the power of Apache Oozie to create and manage your big data and machine learning pipelines in one go

Jagat Jasjit Singh


Book Details

ISBN-13: 9781785880384
Paperback: 164 pages

Book Description

As more organizations discover the value of big data analytics, interest in platforms that provide storage, computation, and analytic capabilities is growing rapidly. This growth calls for data management, a need that Hadoop caters to. Oozie, in turn, meets the need for a scheduler for Hadoop jobs, acting much like cron for the cluster.

Apache Oozie Essentials starts with the basics, from installing and configuring Oozie from source code on your Hadoop cluster to managing complex clusters. You will learn how to create data ingestion and machine learning workflows.

This book is filled with real-life examples and exercises to help you take your big data learning to the next level. You will discover how to write workflows that run your MapReduce, Pig, Hive, and Sqoop scripts, and how to schedule them with a coordinator to run at a specific time or to meet a specific business requirement. Lastly, you’ll get to grips with embedding Spark jobs, which can be used to run your machine learning models on Hadoop.

By the end of the book, you will have a good knowledge of Apache Oozie. You will be capable of using Oozie to handle large Hadoop workflows and even improve the availability of your Hadoop environment.

What You Will Learn

  • Install and configure Oozie from source code on your Hadoop cluster
  • Dive into the world of Oozie with Java MapReduce jobs
  • Schedule Hive ETL and data ingestion jobs
  • Import data from a database through Sqoop jobs in HDFS
  • Create and process data pipelines with Pig and Hive scripts as per business requirements
  • Run machine learning Spark jobs on Hadoop
  • Create quick Oozie jobs using Hue
  • Make the most of Oozie’s security capabilities by configuring its security features

