Learning Cascading

Learn
  • Familiarize yourself with tuples, pipes, taps, and flows and build your first Cascading application
  • Discover how to design, develop, and use custom operations
  • Design, develop, use, and reuse code with subassemblies and Cascades
  • Acquire the skills you need to integrate Cascading with external systems
  • Gain expertise in testing, QA, and performance tuning to run an efficient and successful Cascading project
  • Explore project management methodologies and steps to develop workable solutions
  • Discover the future of big data frameworks and understand how Cascading can help your software evolve with it
  • Uncover sources of additional information and other tools that can make development tasks a lot easier
About

Cascading is open source software that is used to create and execute complex data processing workflows on big data clusters. The book starts by explaining how Cascading relates to core big data technologies such as Hadoop MapReduce. With that foundation in place, the book provides a comprehensive introduction to the Cascading paradigm and its components using code examples. You will not only learn more advanced Cascading features but also write code that uses them. Furthermore, you will gain in-depth knowledge of how to optimize a Cascading application. To deepen your knowledge and experience with Cascading, you will work through a real-life case study that uses Natural Language Processing to perform text analysis and search on large volumes of unstructured text. Throughout the book, you will receive expert advice on how to use the portions of the product that are undocumented or have limited documentation. By the end of the book, you will be able to build practical Cascading applications.

Features
  • Understand how Cascading fits into the big data landscape and hides the complexity of MapReduce to enable the development of streamlined, maintainable, and concise applications
  • Develop a real-life Cascading application that can be easily customized for your specific needs
  • Learn basic and advanced features of Cascading through a practical, hands-on approach with step-by-step instructions and code samples
Page Count 276
Course Length 8 hours 16 minutes
ISBN 9781785288913
Date Of Publication 28 May 2015

Authors

Michael Covert

Michael Covert, CEO, Analytics Inside LLC, has significant experience in a variety of business and technical roles. Michael is a mathematician and computer scientist and is involved in machine learning, deep learning, predictive analytics, graph theory, and big data. He earned a bachelor of science degree in mathematics with honors and distinction from The Ohio State University. He also attended the university as a PhD student, specializing in machine learning and high-performance computing. Michael is a Cloudera Hadoop Certified Developer.

Michael served as the vice president of performance management at Whittman-Hart, Inc., based in Chicago, and as the chief operating officer of Infinis, Inc., a business intelligence consulting company based in Columbus, Ohio. Infinis merged with Whittman-Hart in 2005. Prior to working at Infinis, Michael was the vice president of product development and chief technology officer at Alta Analytics, a producer of data mining and visualization software. He has also served in technology management roles at Claremont Technology Group, Inc., where he was the director of advanced technology.

Victoria Loewengart

Victoria Loewengart, COO, Analytics Inside LLC, is an innovative software systems architect with a proven record of bringing emerging technologies to clients through discovery, design, and integration. Additionally, Victoria spent a large part of her career developing software technologies that extract information from unstructured text. Victoria has published numerous articles on topics ranging from text analytics to intelligence analysis and cyber security. Her book An Introduction to Hacking & Crimeware: A Pocket Guide was published by IT Governance, UK, in January 2012. Victoria earned a bachelor's degree in computer science from Purdue University and a master's degree in intelligence studies from the American Military University.