Learning Hadoop 2 [Video]

Randal Scott King

An introduction to storing, structuring, and analyzing data at scale with Hadoop

Video Details

ISBN 13: 9781785888113
Course Length: 1 hour and 30 minutes

Video Description

Hadoop emerged in response to the ever-growing volumes of data collected by organizations, offering a robust way to store, process, and analyze what has become commonly known as Big Data. It comprises a comprehensive stack of components designed to carry out these tasks at distributed scale, across clusters ranging from a few servers to thousands of machines.

Learning Hadoop 2 introduces you to the powerful system synonymous with Big Data, demonstrating how to create an instance and leverage the Hadoop ecosystem's many components to store, process, manage, and query massive data sets with confidence.

We open this course with an overview of the Hadoop component ecosystem, including HDFS, Sqoop, Flume, YARN, MapReduce, Pig, and Hive, before installing and configuring our Hadoop environment. We then take a look at Hue, a web-based graphical user interface for Hadoop.

We will then explore HDFS, Hadoop’s file system for storing data, and learn how to import and export data both manually and automatically. Afterward, we turn our attention to running computations with MapReduce and get to grips with Pig, Hadoop’s scripting language. Lastly, we will load data from HDFS into Hive and demonstrate how Hive can be used to structure and query data sets.
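
For readers new to the MapReduce model mentioned above, the following is a minimal sketch of the "word count" idea in plain Python. It is not taken from the course and does not use any Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and exist only to illustrate the three stages a MapReduce job passes through, assuming the input is a small list of text lines held in memory.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by key (the word)."""
    grouped = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    return grouped


def reduce_phase(word, counts):
    """Reduce step: sum the counts collected for a single word."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over in-memory lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data big", "Data at scale"]))
```

In a real Hadoop cluster the map and reduce steps would run in parallel on different machines and the framework would handle the grouping between them; here everything runs in a single process purely to show the data flow.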

Style and Approach

Low on theory and high on practice, this introduction to Hadoop delivers step-by-step guidance on setting up an instance and working with each of the components of the Hadoop ecosystem. By the end of this course, you will be able to set up a Hadoop instance and use the framework to store, process, and analyze data.

Table of Contents

The Hadoop Ecosystem
  • The Course Overview
  • Overview of HDFS and YARN
  • Overview of Sqoop and Flume
  • Overview of MapReduce
  • Overview of Pig
  • Overview of Hive

Installing and Configuring Hadoop
  • Downloading and Installing Hadoop
  • Exploring Hue

Data Import and Export
  • Manual Import
  • Importing from Databases Using Sqoop
  • Using Flume to Import Streaming Data

Using MapReduce and Pig
  • Coding "Word Count" in MapReduce
  • Coding "Word Count" in Pig
  • Performing Common ETL Functions in Pig
  • Using User-defined Functions in Pig

Using Hive
  • Importing Data from HDFS into Hive
  • Importing Data Directly from a Database
  • Performing Basic Queries in Hive
  • Putting It All Together

What You Will Learn

  • Install and configure a Hadoop instance of your own
  • Navigate Hue, the GUI for common tasks in Hadoop
  • Import data manually, or automatically from a database
  • Build scripts with Pig to perform common ETL tasks
  • Write and run a simple MapReduce program
  • Structure and query data effectively with Hive, the Hadoop ecosystem’s data warehousing component
