Optimizing Hadoop for MapReduce

This book is the perfect introduction to sophisticated concepts in MapReduce and will ensure you have the knowledge to optimize job performance. This is not an academic treatise; it’s an example-driven tutorial for the real world.

Khaled Tannir

Book Details

ISBN-13: 9781783285655
Paperback: 120 pages

Book Description

MapReduce is the programming model that the Hadoop MapReduce engine uses to distribute work around a cluster, processing smaller data sets in parallel. It is useful in a wide range of applications, including distributed pattern-based searching, distributed sorting, web link-graph reversal, term-vector per host, web access log stats, inverted index construction, document clustering, machine learning, and statistical machine translation.

This book introduces you to advanced MapReduce concepts and teaches you everything from identifying the factors that affect MapReduce job performance to tuning the MapReduce configuration. Based on real-world experience, it details the Hadoop MapReduce job performance optimization process and, through a number of clear and practical steps, helps you fully utilize your cluster’s node resources to run MapReduce jobs optimally.

The book starts with how MapReduce works and the factors that affect its performance, then gives an overview of Hadoop metrics and several performance monitoring tools. Further on, you will explore performance counters that help you identify resource bottlenecks, check cluster health, and size your Hadoop cluster. You will also learn about optimizing map and reduce tasks by using Combiners and compression.

The book ends with best practices and recommendations on how to use your Hadoop cluster optimally.

What You Will Learn

  • Learn about the factors that affect MapReduce performance
  • Utilize the Hadoop MapReduce performance counters to identify resource bottlenecks
  • Size your Hadoop cluster’s nodes
  • Set the number of mappers and reducers correctly
  • Optimize mapper and reducer task throughput and code size using compression and Combiners
  • Understand the various tuning properties and best practices to optimize clusters

