Building Hadoop Clusters [Video]

Sean Mikha

  • Packt Video courses are practical, screencast-based tutorials that show you how to get the job done, with bite-sized chunks, hands-on instructions, and powerful results.
  • Familiarize yourself with Hadoop and its services, and how to configure them
  • Deploy compute instances and set up a three-node Hadoop cluster on Amazon
  • Set up a Linux installation optimized for Hadoop

Video Details

Language : English
Release Date : Thursday, May 22, 2014
Course Length : 2 hours 34 minutes
ISBN : 178328403X
ISBN 13 : 9781783284030
Author(s) : Sean Mikha
Topics and Technologies : Big Data and Business Intelligence, Video

Table of Contents

  1. Deploying Cloud Instances for Hadoop 2.0
    • Introduction to the Cloud and Hadoop
    • Deploying a Linux Amazon Machine Image
    • Setting Up Amazon Instances

  2. Setting Up Network and Security Settings
    • Network and Security Settings Overview
    • Identifying and Allocating Security Groups
    • Configuration of Private Keys in a Windows Environment

  3. Connecting to Cloud Instances
    • Overview of the Connectivity Options for Windows to the Amazon Cloud
    • Installing and Using PuTTY for Connectivity to Windows Clients
    • Transferring Files to Linux Nodes with PSCP

  4. Setting Up Network Connectivity and Access for Hadoop Clusters
    • Defining the Hadoop Cluster
    • Setting Up Password-less SSH on the Head Node
    • Gathering Network Details and Setting Up the HOSTS File

  5. Setting Up Configuration Settings across Hadoop Clusters
    • Setting Up Linux Software Repositories
    • Using the Parallel Shell Utility (pdsh)
    • Prepping for Hadoop Installation

  6. Creating a Hadoop Cluster
    • Building a Hadoop Cluster
    • Installing Hadoop 2 – Part 1
    • Installing Hadoop 2 – Part 2

  7. Loading and Navigating the Hadoop File System (HDFS)
    • Understanding the Hadoop File System
    • Loading and Navigating the Hadoop File System
    • Ambari Server and Dashboard

  8. Hadoop Tools and Processing Files
    • Hadoop Tools and Processing Files
    • Installing HUE
    • Using HUE

Sean Mikha

Sean Mikha is a technical architect who specializes in implementing large-scale data warehouses using Massively Parallel Processing (MPP) technologies. Sean has held roles at multiple companies that specialize in MPP technologies, where he was a part of implementing one of the largest commercial clinical data warehouses in the world. Sean is currently a solution architect, focusing on architecting Big Data solutions while also educating customers on Hadoop technologies. Sean graduated from UCLA with a BS in Computer Engineering, and currently lives in Southern California.


What you will learn from this video course

  • Explore Amazon's Web Services to manage big data
  • Configure network and security settings when deploying instances to the cloud
  • Explore methods to connect to cloud instances using your client machine
  • Set up Linux environments and configure settings for services and package installations
  • Examine Hadoop's general architecture and what each service brings to the table
  • Harness and navigate Hadoop's file storage and processing mechanisms
  • Install and master HUE (Hadoop User Experience), a web interface for Hadoop

Who this video course is for

If you are a system administrator or anyone interested in building a Hadoop cluster to process large sets of data, this video course is for you. This video series assumes no prior knowledge of any cloud technologies, Hadoop, or Linux.

In Detail

Hadoop is an Apache top-level project that allows the distributed processing of large data sets across clusters of computers using simple programming models. It allows you to deliver a highly available service on top of a cluster of computers, each of which may be prone to failures. While Big Data and Hadoop have seen a massive surge in popularity over the last few years, many companies still struggle with trying to set up their own computing clusters.

This video series will turn you from a faltering first-timer into a Hadoop pro through clear, concise descriptions that are easy to follow.

We'll begin this course with an overview of Amazon's cloud service and its use. We'll then deploy Linux compute instances and you'll see how to connect your client machine to Linux hosts and configure your systems to run Hadoop. Finally, you'll install Hadoop, download data, and examine how to run a query.
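
Part of configuring the systems to run Hadoop is giving every node the same view of the cluster's hostnames; the table of contents lists setting up the HOSTS file as one of the steps. The following is a minimal sketch of that idea, not the course's own material: it renders hosts-file entries for a three-node cluster. The IP addresses, hostnames, and the function name `render_hosts_entries` are all hypothetical placeholders.

```python
from ipaddress import ip_address

# Hypothetical private IPs and hostnames for a three-node cluster;
# replace them with the values reported by your own instances.
NODES = [
    ("10.0.0.11", "head-node"),
    ("10.0.0.12", "worker-1"),
    ("10.0.0.13", "worker-2"),
]


def render_hosts_entries(nodes):
    """Return hosts-file text with one 'IP<TAB>hostname' line per node."""
    lines = []
    seen = set()
    for ip, hostname in nodes:
        ip_address(ip)  # raises ValueError if the address is malformed
        if hostname in seen:
            raise ValueError(f"duplicate hostname: {hostname}")
        seen.add(hostname)
        lines.append(f"{ip}\t{hostname}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_hosts_entries(NODES), end="")
```

The printed lines would be appended to the hosts file on every node, so that each machine can resolve the others by name.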

This video series will go beyond just Hadoop; it will cover everything you need to get your own clusters up and running. You will learn how to make network configuration changes as well as modify Linux services. After you've installed Hadoop, we'll go over installing HUE, a web UI for Hadoop. Using HUE, you will learn how to download data to your Hadoop clusters, move it to HDFS, and finally query that data with Hive.
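
The course performs the load-and-query workflow through HUE. As a rough command-line counterpart, the sketch below builds (but does not run) the equivalent `hdfs dfs` and `hive -e` invocations. It is an illustration under assumptions, not the course's procedure: the file name, HDFS directory, table name, and the helper `build_load_and_query_commands` are hypothetical, and the query assumes a Hive table has already been defined over the uploaded data.

```python
import shlex


def build_load_and_query_commands(local_file, hdfs_dir, hive_query):
    """Return argv lists that create an HDFS directory, upload a local
    file into it, and run a Hive query from the command line."""
    return [
        ["hdfs", "dfs", "-mkdir", "-p", hdfs_dir],
        ["hdfs", "dfs", "-put", local_file, hdfs_dir],
        ["hive", "-e", hive_query],
    ]


if __name__ == "__main__":
    # Hypothetical file, directory, and table names.
    commands = build_load_and_query_commands(
        "sample.csv",
        "/user/hadoop/sample",
        "SELECT COUNT(*) FROM sample_table;",
    )
    for argv in commands:
        # Print shell-quoted commands for review instead of executing them.
        print(shlex.join(argv))
```

On a node where the Hadoop and Hive clients are installed, each list could be passed to `subprocess.run(argv, check=True)` to execute the steps in order.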

Learn everything you need to deploy Hadoop clusters to the Cloud through these videos. You'll grasp all you need to know about handling large data sets over multiple nodes.

Screenshots from the course

  • Setting Up Amazon Instances
  • Identifying and Allocating Security Groups
  • Transferring Files to Linux Nodes with PSCP
  • Building a Hadoop Cluster
  • Loading and Navigating the Hadoop File System
  • Installing HUE

Packt video courses are designed to cover the breadth of the topic in short, hands-on, task-based videos. Each course is divided into short manageable sections, so you can watch the whole thing or jump to the bit you need. The focus is on practical instructions and screencasts showing you how to get the job done.

The course is packed with explanations of everything you'll need to set up, including simple, systematic examples that will get you started with ease.
