Apache Hive Cookbook

Easy, hands-on recipes to help you understand Hive and its integration with frameworks that are used widely in today's big data world

Hanish Bansal, Saurabh Chauhan, Shrey Mehrotra

Book Details

ISBN 13: 9781782161080
Paperback: 268 pages

Book Description

Hive was developed at Facebook and later open-sourced through the Apache community. It provides a SQL-like interface for running queries on big data frameworks. Its query language, HiveQL, supports many SQL capabilities, including the analytical functions that are in heavy demand in today’s big data world.

This book provides easy installation steps for the different types of metastore that Hive supports, along with simple, easy-to-learn recipes for configuring Hive clients and services. You will also learn about Hive optimizations, including partitioning and bucketing. The book also explains the source code of the latest Hive version.

The Hive Query Language is also used by other frameworks, including Spark. Towards the end, the book covers the integration of Hive with these frameworks.

Table of Contents

Chapter 1: Developing Hive
Introduction
Deploying Hive on a Hadoop cluster
Deploying Hive Metastore
Installing Hive
Configuring HCatalog
Understanding different components of Hive
Compiling Hive from source
Hive packages
Debugging Hive
Running Hive
Changing configurations at runtime
Chapter 2: Services in Hive
Introducing HiveServer2
Understanding HiveServer2 properties
Configuring HiveServer2 high availability
Using HiveServer2 clients
Introducing the Hive metastore service
Configuring high availability of metastore service
Introducing Hue
Chapter 3: Understanding the Hive Data Model
Introduction
Using numeric data types
Using string data types
Using Date/Time data types
Using miscellaneous data types
Using complex data types
Using operators
Partitioning
Partitioning a managed table
Partitioning an external table
Bucketing
Chapter 4: Hive Data Definition Language
Introduction
Creating a database schema
Dropping a database schema
Altering a database schema
Using a database schema
Showing database schemas
Describing a database schema
Creating tables
Dropping tables
Truncating tables
Renaming tables
Altering table properties
Creating views
Dropping views
Altering the view properties
Altering the view as select
Showing tables
Showing partitions
Show the table properties
Showing create table
HCatalog
WebHCat
Chapter 5: Hive Data Manipulation Language
Introduction
Loading files into tables
Inserting data into Hive tables from queries
Inserting data into dynamic partitions
Writing data into files from queries
Enabling transactions in Hive
Inserting values into tables from SQL
Updating data
Deleting data
Chapter 6: Hive Extensibility Features
Introduction
Serialization and deserialization formats and data types
Exploring views
Exploring indexes
Hive partitioning
Creating buckets in Hive
Analytics functions in Hive
Windowing in Hive
File formats
Chapter 7: Joins and Join Optimization
Understanding the joins concept
Using a left/right/full outer join
Using a left semi join
Using a cross join
Using a map-side join
Using a bucket map join
Using a bucket sort merge map join
Using a skew join
Chapter 8: Statistics in Hive
Bringing statistics in to Hive
Table and partition statistics in Hive
Column statistics in Hive
Top K statistics in Hive
Chapter 9: Functions in Hive
Using built-in functions
Using the built-in User-defined Aggregation Function (UDAF)
Using the built-in User Defined Table Function (UDTF)
Creating custom User-Defined Functions (UDF)
Chapter 10: Hive Tuning
Enabling predicate pushdown optimizations in Hive
Optimizations to reduce the number of map
Sampling
Chapter 11: Hive Security
Securing Hadoop
Authorizing Hive
Configuring the SQL standards-based authorization
Authenticating Hive
Chapter 12: Hive Integration with Other Frameworks
Working with Apache Spark
Working with Accumulo
Working with HBase
Working with Google Drill

What You Will Learn

  • Learn the different features and offerings of the latest Hive
  • Understand the workings and structure of the Hive internals
  • Get an insight into the latest developments in the Hive framework
  • Grasp the concepts of the Hive data model
  • Master key concepts such as partitions, buckets, and statistics
  • Know how to integrate Hive with other frameworks such as Spark and Accumulo
