Hadoop MapReduce v2 Cookbook - Second Edition

More Information
  • Configure and administer Hadoop YARN, MapReduce v2, and HDFS clusters
  • Use Hive, HBase, Pig, Mahout, and Nutch with Hadoop v2 to solve your big data problems easily and effectively
  • Solve large-scale analytics problems using MapReduce-based applications
  • Tackle complex problems such as classification, finding relationships, online marketing, recommendations, and searching using Hadoop MapReduce and other related projects
  • Perform massive text data processing using Hadoop MapReduce and other related projects
  • Deploy your clusters to cloud environments

Starting with the installation of Hadoop YARN, MapReduce, HDFS, and other Hadoop ecosystem components, this book quickly moves on to topics such as MapReduce patterns and using Hadoop for analytics, classification, online marketing, recommendations, and data indexing and searching. You will learn how to take advantage of Hadoop ecosystem projects including Hive, HBase, Pig, Mahout, Nutch, and Giraph, and you will be introduced to deploying clusters in cloud environments.

Finally, you will be able to apply the knowledge you have gained to your own real-world scenarios to achieve the best possible results.

  • Process large and complex datasets using next generation Hadoop
  • Install, configure, and administer MapReduce programs and learn what’s new in MapReduce v2
  • More than 90 Hadoop MapReduce recipes presented in a simple and straightforward manner, with step-by-step instructions and real-world examples
Page Count 322
Course Length 9 hours 39 minutes
ISBN 9781783285471
Date Of Publication 25 Feb 2015


Thilina Gunarathne

Thilina Gunarathne is a senior data scientist at KPMG LLP. He led the Hadoop-related efforts at Link Analytics before its acquisition by KPMG LLP. He has extensive experience in using Apache Hadoop and its related technologies for large-scale data-intensive computations. He coauthored the first edition of this book, Hadoop MapReduce Cookbook, with Dr. Srinath Perera.

Thilina has contributed to several open source projects at the Apache Software Foundation as a member, a committer, and a PMC member. He has also published many peer-reviewed research articles on how to extend the MapReduce model to perform efficient data mining and data analytics computations in the cloud. Thilina received his PhD and MSc degrees in computer science from Indiana University, Bloomington, USA, and his bachelor of science degree in computer science and engineering from the University of Moratuwa, Sri Lanka.