Mastering Hadoop

Learn
  • Understand the changes involved in moving from Hadoop 1.0 to Hadoop 2.0
  • Customize and optimize MapReduce jobs in Hadoop 2.0
  • Explore Hadoop I/O and different data formats
  • Dive into YARN and Storm and use YARN to integrate Storm with Hadoop
  • Deploy Hadoop on Amazon Elastic MapReduce
  • Discover HDFS replacements and learn about HDFS Federation
  • Get to grips with Hadoop's main security aspects
  • Utilize Mahout and RHadoop for Hadoop analytics
About

Hadoop is synonymous with Big Data processing. Its simple programming model, "code once and deploy at any scale" paradigm, and ever-growing ecosystem make Hadoop an all-encompassing platform for programmers with different levels of expertise.

This book explores industry guidelines for optimizing MapReduce jobs and higher-level abstractions such as Pig and Hive in Hadoop 2.0. It then dives deep into Hadoop 2.0-specific features such as YARN and HDFS Federation.

This book is a step-by-step guide that focuses on advanced Hadoop concepts and aims to take your Hadoop knowledge and skill set to the next level. The data processing flow dictates the order of the concepts in each chapter, and each chapter is illustrated with code fragments or schematic diagrams.

Features
  • Learn how to optimize Hadoop MapReduce, Pig and Hive
  • Dive into YARN and learn how it can integrate Storm with Hadoop
  • Understand how Hadoop can be deployed on the cloud and gain insights into analytics with Hadoop
Page Count 374
Course Length 11 hours 13 minutes
ISBN 9781783983643
Date Of Publication 29 Dec 2014

Authors

Sandeep Karanth

Sandeep Karanth is a technical architect who specializes in building and operationalizing software systems. He has more than 14 years of experience in the software industry, working on a gamut of products ranging from enterprise data applications to newer-generation mobile applications. He has primarily worked at Microsoft Corporation in Redmond and Microsoft Research in India, and is currently a cofounder at Scibler, where he architects data intelligence products.