Apache Hive Essentials

Immerse yourself in a fantastic journey to discover the attributes of big data by using Hive

Dayong Du


Book Details

ISBN 13: 9781783558575
Paperback: 208 pages

Book Description

This book prepares you for your journey into big data by first introducing the background of the big data domain and then walking you through setting up and getting familiar with your Hive working environment. Next, it guides you through discovering and transforming the value of big data with the help of examples, and hones your skill in using the Hive language efficiently. Towards the end, the book focuses on advanced topics such as performance, security, and extensions in Hive, which will take you further along this big data journey.

By the end of the book, you will be familiar with Hive and able to work efficiently to find solutions to big data problems.

Table of Contents

Chapter 1: Overview of Big Data and Hive
  • A short history
  • Introducing big data
  • Relational and NoSQL database versus Hadoop
  • Batch, real-time, and stream processing
  • Overview of the Hadoop ecosystem
  • Hive overview
  • Summary
Chapter 2: Setting Up the Hive Environment
  • Installing Hive from Apache
  • Installing Hive from vendor packages
  • Starting Hive in the cloud
  • Using the Hive command line and Beeline
  • The Hive-integrated development environment
  • Summary
Chapter 3: Data Definition and Description
  • Understanding Hive data types
  • Data type conversions
  • Hive Data Definition Language
  • Hive database
  • Hive internal and external tables
  • Hive partitions
  • Hive buckets
  • Hive views
  • Summary
Chapter 4: Data Selection and Scope
  • The SELECT statement
  • The INNER JOIN statement
  • The OUTER JOIN and CROSS JOIN statements
  • Special JOIN – MAPJOIN
  • Set operation – UNION ALL
  • Summary
Chapter 5: Data Manipulation
  • Data exchange – LOAD
  • Data exchange – INSERT
  • Data exchange – EXPORT and IMPORT
  • ORDER and SORT
  • Operators and functions
  • Transactions
  • Summary
Chapter 6: Data Aggregation and Sampling
  • Basic aggregation – GROUP BY
  • Advanced aggregation – GROUPING SETS
  • Advanced aggregation – ROLLUP and CUBE
  • Aggregation condition – HAVING
  • Analytic functions
  • Sampling
  • Summary
Chapter 7: Performance Considerations
  • Performance utilities
  • Design optimization
  • Data file optimization
  • Job and query optimization
  • Summary
Chapter 8: Extensibility Considerations
  • User-defined functions
  • Streaming
  • SerDe
  • Summary
Chapter 9: Security Considerations
  • Authentication
  • Authorization
  • Encryption
  • Summary
Chapter 10: Working with Other Tools
  • JDBC / ODBC connector
  • HBase
  • Hue
  • HCatalog
  • ZooKeeper
  • Oozie
  • Hive roadmap
  • Summary

What You Will Learn

  • Create and set up the Hive environment
  • Discover how to use Hive's definition language to describe data
  • Discover interesting data by joining and filtering datasets in Hive
  • Transform data by using Hive sorting, ordering, and functions
  • Aggregate and sample data in different ways
  • Boost Hive query performance and enhance data security in Hive
  • Customize Hive to your needs by using user-defined functions and integrate it with other tools
