Learning Hadoop 2


Product type Book
Published in Feb 2015
Publisher Packt
ISBN-13 9781783285518
Pages 382 pages
Edition 1st Edition

Table of Contents (18) Chapters

Learning Hadoop 2
Credits
About the Authors
About the Reviewers
www.PacktPub.com
Preface
Introduction
Storage
Processing – MapReduce and Beyond
Real-time Computation with Samza
Iterative Computation with Spark
Data Analysis with Apache Pig
Hadoop and SQL
Data Lifecycle Management
Making Development Easier
Running a Hadoop Cluster
Where to Go Next
Index

Summary


Hopefully, this chapter presented the topic of data life cycle management as something other than a dry abstract concept. We covered a lot, particularly:

  • The definition of data life cycle management and how it covers a number of issues and techniques that usually become important with large data volumes

  • The concept of building a data ingest pipeline along good data life cycle management principles that can then be utilized by higher-level analytic tools

  • Oozie as a Hadoop-focused workflow manager and how we can use it to compose a series of actions into a unified workflow

  • Various Oozie tools, such as subworkflows, parallel action execution, and global variables, that allow us to apply true design principles to our workflows

  • HCatalog and how it provides the means for tools other than Hive to read and write table-structured data; we showed its great promise and integration with tools such as Pig but also highlighted some current weaknesses

  • Avro as our tool of choice to handle schema evolution...
