From 0 to 1: Hive for Processing Big Data [Video]

Loonycorn

End-to-End Hive: HQL, Partitioning, Bucketing, UDFs, Windowing, Optimization, Map Joins, Indexes

Video Details

ISBN 13: 9781788995054
Course Length: 15 hours 16 minutes

Video Description

Hive is like a new friend with an old face (SQL). This course is an end-to-end, practical guide to using Hive for Big Data processing. Let's parse that.

A new friend with an old face: Hive helps you leverage the power of distributed computing and Hadoop for analytical processing. Its interface is like an old friend: the very SQL-like HiveQL. This course will fill in all the gaps between SQL and what you need to use Hive.

End-to-end: The course is an end-to-end guide for using Hive. Whether you are an analyst who wants to process data or an engineer who needs to build custom functionality or optimize performance, everything you'll need is right here. New to SQL? No need to look elsewhere: the course has a primer on all the basic SQL constructs.

Practical: Everything is taught using real-life examples, working queries and code.

Style and Approach

A 15-hour course that gives you very detailed coverage of the topics, with excellent graphics used to explain concepts.

Table of Contents

You, Us & This Course
You, Us & This Course
Introducing Hive
Hive: An Open-Source Data Warehouse
Hive and Hadoop
Hive vs Traditional Relational DBMS
HiveQL and SQL
Hadoop and Hive Install
Hadoop Install Modes
Hadoop Install Step 1: Standalone Mode
Hadoop Install Step 2: Pseudo-Distributed Mode
Hive install
Code-Along: Getting started
Hadoop and HDFS Overview
What is Hadoop?
HDFS or the Hadoop Distributed File System
Hive Basics
Primitive Datatypes
Collections: Arrays and Maps
Structs and Unions
Create Table
Insert Into Table
Insert into Table 2
Alter Table
HDFS
HDFS CLI - Interacting with HDFS
Code-Along: Create Table
Code-Along: Hive CLI
Built-in Functions
Three types of Hive functions
The Case-When statement, the Size function, the Cast function
The Explode function
Code-Along: Hive Built-in functions
Sub-Queries
Quirky Sub-Queries
More on subqueries: Exists and In
Inserting via subqueries
Code-Along: Use Subqueries to work with Collection Datatypes
Views
Partitioning
Indices
Partitioning Introduced
The Rationale for Partitioning
How Tables are partitioned
Using Partitioned Tables
Dynamic Partitioning: Inserting data into partitioned tables
Code-Along: Partitioning
Bucketing
Introducing Bucketing
The Advantages of Bucketing
How Tables are bucketed
Using Bucketed Tables
Sampling
Windowing
Windowing Introduced
Windowing - A Simple Example: Cumulative Sum
Windowing - A More Involved Example: Partitioning
Windowing - Special Aggregation Functions
Understanding MapReduce
The basic philosophy underlying MapReduce
MapReduce - Visualized and Explained
MapReduce - Digging a little deeper at every step
MapReduce logic for queries: Behind the scenes
MapReduce Overview: Basic Select-From-Where
MapReduce Overview: Group-By and Having
MapReduce Overview: Joins
Join Optimizations in Hive
Improving Join performance with tables of different sizes
The Where clause in Joins
The Left Semi Join
Map Side Joins: The Inner Join
Map Side Joins: The Left, Right and Full Outer Joins
Map Side Joins: The Bucketed Map Join and the Sorted Merge Join
Custom Functions in Python
Custom functions in Python
Code-Along: Custom Function in Python
Custom functions in Java
Introducing UDFs - you're not limited by what Hive offers
The Simple UDF: The standard function for primitive types
The Simple UDF: Java implementation for replacetext()
Generic UDFs, the Object Inspector and DeferredObjects
The Generic UDF: Java implementation for containsstring()
The UDAF: Custom aggregate functions can get pretty complex
The UDAF: Java implementation for max()
The UDAF: Java implementation for Standard Deviation
The Generic UDTF: Custom table generating functions
The Generic UDTF: Java implementation for namesplit()
SQL Primer - Select Statements
Select Statements
Select Statements 2
Operator Functions
SQL Primer - Group By, Order by and Having
Aggregation Operators Introduced
The Group by Clause
More Group by Examples
Order by
Having
SQL Primer – Joins
Introduction to SQL Joins
Cross Joins and Cartesian Joins
Inner Joins
Left Outer Joins
Right, Full Outer Joins, Natural Joins, Self Joins
Appendix
[For Linux/Mac OS Shell Newbies] Path and other Environment Variables

What You Will Learn

  • Write complex analytical queries on data in Hive and uncover insights
  • Leverage ideas of partitioning, bucketing to optimize queries in Hive
  • Customize hive with user-defined functions in Java and Python
  • Understand what goes on under the hood of Hive with HDFS and MapReduce
