Hadoop Cluster Deployment

Construct a modern Hadoop data platform effortlessly and gain insights into how to manage clusters efficiently

Danil Zburivsky

Book Details

ISBN 13: 9781783281718
Paperback: 126 pages

Book Description

Big Data is the hottest trend in the IT industry at the moment. Companies are realizing the value of collecting, retaining, and analyzing as much data as possible. They are therefore rushing to implement the next generation of data platform, and Hadoop is the centerpiece of these platforms.

This practical guide is filled with examples that show you how to build a data platform using Hadoop. Step-by-step instructions explain how to install, configure, and tie all major Hadoop components together. This book will help you avoid common pitfalls, follow best practices, and go beyond the basics when building a Hadoop cluster.

This book will walk you through the process of building a Hadoop cluster from the ground up. By using practical examples and command samples, you will be able to get a cluster up and running in no time, and you will also gain a deep understanding of how various Hadoop components work and interact with each other.

You will learn how to pick the right hardware for different types of Hadoop clusters and how the various Hadoop distributions differ. By the end of this book, you will be able to install and configure several of the most popular Hadoop ecosystem projects, including Hive, Impala, and Sqoop, and you will also get a look at the pros and cons of using Hadoop in the cloud.

Table of Contents

Chapter 1: Setting Up Hadoop Cluster – from Hardware to Distribution
  • Choosing Hadoop cluster hardware
  • Hadoop distributions
  • Choosing OS for the Hadoop cluster
  • Summary
Chapter 2: Installing and Configuring Hadoop
  • Configuring OS for Hadoop cluster
  • Setting up NameNode
  • Summary
Chapter 3: Configuring the Hadoop Ecosystem
  • Hosting the Hadoop ecosystem
  • Sqoop
  • Hive
  • Impala
  • Summary
Chapter 4: Securing Hadoop Installation
  • Hadoop security overview
  • HDFS security
  • MapReduce security
  • Hadoop Service Level Authorization
  • Hadoop and Kerberos
  • Summary
Chapter 5: Monitoring Hadoop Cluster
  • Monitoring strategy overview
  • Hadoop Metrics
  • Monitoring MapReduce
  • Monitoring Hadoop with Ganglia
  • Summary
Chapter 6: Deploying Hadoop to the Cloud
  • Amazon Elastic MapReduce
  • Using Whirr
  • Summary

What You Will Learn

  • Choose the optimal hardware configuration for your Hadoop cluster
  • Decipher the differences between various Hadoop versions and distributions
  • Make your cluster crash-proof with NameNode High Availability
  • Learn tips and tricks for JobTracker, TaskTracker, and DataNodes
  • Discover the most important Hadoop ecosystem projects
  • Get more value out of your cluster by using SQL with Hive and real-time query processing with Impala
  • Set up a proper permissions model for your cluster
  • Secure Hadoop with Kerberos
  • Deploy a Hadoop cluster in a cloud environment
