
Building Hadoop Clusters [Video]

Sean Mikha

Deploy multi-node Hadoop clusters to harness the Cloud for storage and large-scale data processing
RRP $84.99


Video Details

ISBN 13: 9781783284030
Course Length: 2 hours 34 minutes

About This Video

  • Familiarize yourself with Hadoop and its services, and how to configure them
  • Deploy compute instances and set up a three-node Hadoop cluster on Amazon
  • Set up a Linux installation optimized for Hadoop

Who This Video Is For

If you are a system administrator or anyone interested in building a Hadoop cluster to process large sets of data, this video course is for you. This video series assumes no prior knowledge of any cloud technologies, Hadoop, or Linux.

Table of Contents

Deploying Cloud Instances for Hadoop 2.0
  • Introduction to the Cloud and Hadoop
  • Deploying a Linux Amazon Machine Image
  • Setting Up Amazon Instances

Setting Up Network and Security Settings
  • Network and Security Settings Overview
  • Identifying and Allocating Security Groups
  • Configuration of Private Keys in a Windows Environment

Connecting to Cloud Instances
  • Overview of the Connectivity Options for Windows to the Amazon Cloud
  • Installing and Using Putty for Connectivity to Windows Clients
  • Transferring Files to Linux Nodes with PSCP

Setting Up Network Connectivity and Access for Hadoop Clusters
  • Defining the Hadoop Cluster
  • Setting Up Password-less SSH on the Head Node
  • Gathering Network Details and Setting Up the HOSTS File

Setting Up Configuration Settings across Hadoop Clusters
  • Setting Up Linux Software Repositories
  • Using the Parallel Shell Utility (pdsh)
  • Prepping for Hadoop Installation

Creating a Hadoop Cluster
  • Building a Hadoop Cluster
  • Installing Hadoop 2 – Part 1
  • Installing Hadoop 2 – Part 2

Loading and Navigating the Hadoop File System (HDFS)
  • Understanding the Hadoop File System
  • Loading and Navigating the Hadoop File System
  • Ambari Server and Dashboard

Hadoop Tools and Processing Files
  • Hadoop Tools and Processing Files
  • Installing HUE
  • Using HUE

What You Will Learn

  • Explore Amazon Web Services to manage big data
  • Configure network and security settings when deploying instances to the cloud
  • Explore methods to connect to cloud instances using your client machine
  • Set up Linux environments and configure settings for services and package installations
  • Examine Hadoop's general architecture and what each service brings to the table
  • Harness and navigate Hadoop's file storage and processing mechanisms
  • Install and master HUE (Hadoop User Experience), a web interface for Hadoop

In Detail

Hadoop is an Apache top-level project that allows the distributed processing of large data sets across clusters of computers using simple programming models. It allows you to deliver a highly available service on top of a cluster of computers, each of which may be prone to failures. While Big Data and Hadoop have seen a massive surge in popularity over the last few years, many companies still struggle to set up their own computing clusters.

This video series will turn you from a faltering first-timer into a Hadoop pro through clear, concise descriptions that are easy to follow.

We'll begin this course with an overview of Amazon's cloud service and its use. We'll then deploy Linux compute instances and you'll see how to connect your client machine to Linux hosts and configure your systems to run Hadoop. Finally, you'll install Hadoop, download data, and examine how to run a query.

This video series will go beyond just Hadoop; it will cover everything you need to get your own clusters up and running. You will learn how to make network configuration changes as well as modify Linux services. After you've installed Hadoop, we'll go over installing HUE, a web UI for Hadoop. Using HUE, you will learn how to download data to your Hadoop clusters, move it to HDFS, and finally query that data with Hive.

Learn everything you need to deploy Hadoop clusters to the Cloud through these videos. You'll grasp all you need to know about handling large data sets over multiple nodes.


