Pig Design Patterns
- Quickly understand how to use Pig to design end-to-end Big Data systems
- Implement a hands-on programming approach using design patterns to solve commonly occurring enterprise Big Data challenges
- Enhance your ability to use Pig and create your own design patterns wherever applicable
Book Details
Language : English
Paperback : 310 pages [ 235mm x 191mm ]
Release Date : April 2014
ISBN : 1783285559
ISBN 13 : 9781783285556
Author(s) : Pradeep Pasupuleti
Topics and Technologies : All Books, Big Data and Business Intelligence, Open Source
Table of Contents
Chapter 1: Setting the Context for Design Patterns in Pig
Chapter 2: Data Ingest and Egress Patterns
Chapter 3: Data Profiling Patterns
Chapter 4: Data Validation and Cleansing Patterns
Chapter 5: Data Transformation Patterns
Chapter 6: Understanding Data Reduction Patterns
Chapter 7: Advanced Patterns and Future Work
What you will learn from this book
- Understand Pig's relevance in an enterprise context
- Use Pig in design patterns that enable data movement across platforms during and after analytical processing
- See how Pig can co-exist with other components of the Hadoop ecosystem to create Big Data solutions using design patterns
- Simplify the process of creating complex data pipelines using transformations, aggregations, enrichment, cleansing, filtering, reformatting, lookups, and data type conversions
- Apply knowledge of Pig in design patterns that deal with integration of Hadoop with other systems to enable multi-platform analytics
- Comprehend design patterns and use Pig in cases related to complex analysis of pure structured data
Pig Design Patterns is a comprehensive guide that enables readers to readily use design patterns that simplify the creation of complex data pipelines at various stages of data management. This book focuses on using Pig in an enterprise context, bridging the gap between theoretical understanding and practical implementation. Each chapter contains a set of design patterns that pose and then solve technical challenges relevant to enterprise use cases.
The book covers the journey of Big Data from the time it enters the enterprise to its eventual use in analytics, in the form of a report or a predictive model. By the end of the book, readers will appreciate Pig's real power in addressing each and every problem encountered when creating an analytics-based data product. Each design pattern comes with a suggested solution, an analysis of the trade-offs of implementing the solution in a different way, an explanation of how the code works, and the results.
This comprehensive, practical guide walks you through the multiple stages of data management in the enterprise and gives you numerous design patterns, with appropriate code examples, to solve frequent problems in each of these stages. The chapters are organized to mimic the sequential data flow seen in analytics platforms, but they can also be read independently to solve a particular group of problems in the Big Data life cycle.
Who this book is for
This book is for experienced developers who are already familiar with Pig and are looking for a use-case perspective from which they can relate to the problems of data ingestion, profiling, cleansing, transformation, and egress encountered in enterprises. Knowledge of Hadoop and Pig is necessary to grasp the intricacies of Pig design patterns.