Learning Cascading

Build reliable, robust, and high-performance big data applications efficiently using the Cascading application development platform

Michael Covert, Victoria Loewengart

Book Details

ISBN 13: 9781785288913
Paperback: 276 pages

Book Description

Cascading is open source software that is used to create and execute complex data processing workflows on big data clusters. The book starts by explaining how Cascading relates to core big data technologies such as Hadoop MapReduce. Having instilled an understanding of the technology, the book provides a comprehensive introduction to the Cascading paradigm and its components using code examples. You will not only learn more advanced Cascading features, you will also write code to utilize them. Furthermore, you will gain in-depth knowledge of how to efficiently optimize a Cascading application. To deepen your knowledge and experience with Cascading, you will work through a real-life case study using Natural Language Processing to perform text analysis and search on large volumes of unstructured text. Throughout the book, you will receive expert advice on how to use the portions of the product that are undocumented or have limited documentation. By the end of the book, you will be able to build practical Cascading applications.

Table of Contents

Chapter 1: The Big Data Core Technology Stack
Reviewing Hadoop
MapReduce execution framework
The Cascading framework
Summary
Chapter 2: Cascading Basics in Detail
Understanding common Cascading themes
Understanding how Cascading represents records
Understanding how Cascading controls data flow
Putting it all together
Summary
Chapter 3: Understanding Custom Operations
Understanding operations
Summary
Chapter 4: Creating Custom Operations
Writing custom operations
Identifying common use cases for custom operations
Summary
Chapter 5: Code Reuse and Integration
Creating and using subassemblies
Using cascades
Dynamically controlling flows
Integrating external components
Summary
Chapter 6: Testing a Cascading Application
Debugging a Cascading application
Testing strategies
Summary
Chapter 7: Optimizing the Performance of a Cascading Application
Optimizing performance
Summary
Chapter 8: Creating a Real-world Application in Cascading
Project description – Business Intelligence case study on monitoring the competition
Project scope – understanding requirements
Defining the project – the Cascading development methodology
Building the workflow
Next steps
Summary
Chapter 9: Planning for Future Growth
Finding online resources
Using other Cascading tools
Custom taps
Cascading serializers
Java open source mock frameworks
Summary

What You Will Learn

  • Familiarize yourself with tuples, pipes, taps, and flows and build your first Cascading application
  • Discover how to design, develop, and use custom operations
  • Design, develop, use, and reuse code with subassemblies and Cascades
  • Acquire the skills you need to integrate Cascading with external systems
  • Gain expertise in testing, QA, and performance tuning to run an efficient and successful Cascading project
  • Explore project management methodologies and steps to develop workable solutions
  • Discover the future of big data frameworks and understand how Cascading can help your software to evolve with it
  • Uncover sources of additional information and other tools that can make development tasks a lot easier
