
You're reading from  Monitoring Hadoop

Product type: Book
Published in: Apr 2015
Publisher:
ISBN-13: 9781783281558
Edition: 1st Edition
Author (1)
Aman Singh

Gurmukh Singh is a seasoned technology professional with 14+ years of industry experience in infrastructure design, distributed systems, performance optimization, and networks. He has worked in the big data domain for the last 5 years and provides consultancy and training on various technologies. He has worked with companies such as HP, JP Morgan, and Yahoo. He is the author of Monitoring Hadoop, published by Packt Publishing.

YARN framework


YARN (Yet Another Resource Negotiator) is the new MapReduce framework. It is designed to scale to large clusters and performs much better than the old framework. The new framework introduces a new set of daemons, and it helps to understand how they communicate with each other. The following diagram shows the daemons and the ports on which they communicate:
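Because each daemon listens on its own port, a simple reachability probe is often a useful first monitoring check. The following Python sketch is one possible way to do this. The port numbers are the stock Hadoop 2.x defaults, not values taken from the diagram, and a real cluster may override them in yarn-site.xml or mapred-site.xml. The names YARN_DEFAULT_PORTS, is_port_open, and check_daemons are hypothetical helpers introduced only for this illustration.

```python
import socket

# Assumption: stock Hadoop 2.x default ports. Verify against your own
# yarn-site.xml / mapred-site.xml before relying on them.
YARN_DEFAULT_PORTS = {
    "ResourceManager client RPC": 8032,
    "ResourceManager scheduler": 8030,
    "ResourceManager resource tracker": 8031,
    "ResourceManager admin": 8033,
    "ResourceManager web UI": 8088,
    "NodeManager web UI": 8042,
    "JobHistory Server web UI": 19888,
}


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


def check_daemons(host, ports=None):
    """Probe each named port on host and map daemon name -> reachable."""
    if ports is None:
        ports = YARN_DEFAULT_PORTS
    return {name: is_port_open(host, port) for name, port in ports.items()}


if __name__ == "__main__":
    # "localhost" is a placeholder; daemons usually run on different hosts.
    for name, ok in check_daemons("localhost").items():
        print(f"{name}: {'reachable' if ok else 'unreachable'}")
```

An open port only shows that something is listening. It does not prove the daemon is healthy, so a check like this complements log and metrics monitoring and does not replace it.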

Common issues faced on a Hadoop cluster

With a distributed framework of the scale of Hadoop, many things can go wrong. It is not possible to capture all the issues that could occur, but from a monitoring perspective, we can list the things that are common and can be monitored easily. The following table tries to capture the common issues faced in Hadoop:


Issue: High CPU utilization

Description and steps that could help: This could be due to a high query rate or a faulty job. Use the top command to find the offending processes. On the NameNode, it could be due to a large number of handlers or DataNodes sending block reports at...
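To automate the "find the offending processes" step, a monitoring script can rank processes by CPU usage. The sketch below uses ps, not top, because its output is easier to parse. This is a substitution: on Linux the %CPU that ps reports is averaged over the process lifetime, so it can differ from the instantaneous figure that top shows. The function names parse_ps_output and top_cpu_processes are hypothetical helpers for this example.

```python
import subprocess


def parse_ps_output(text, limit=5):
    """Parse `ps -eo pid,pcpu,comm` output; return the busiest processes.

    Returns a list of (pid, cpu_percent, command) tuples, highest CPU first.
    """
    rows = []
    for line in text.strip().splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        pid, pcpu, comm = parts
        try:
            rows.append((int(pid), float(pcpu), comm.strip()))
        except ValueError:
            continue  # ignore lines that do not match the expected layout
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]


def top_cpu_processes(limit=5):
    """Run ps on the local machine and return its top CPU consumers."""
    result = subprocess.run(
        ["ps", "-eo", "pid,pcpu,comm"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_ps_output(result.stdout, limit)


if __name__ == "__main__":
    for pid, cpu, command in top_cpu_processes():
        print(f"{pid:>7} {cpu:>6.1f}% {command}")
```

On a Hadoop node the daemons typically appear as java processes, so the PID from this listing usually has to be matched against the daemon's own PID to tell which service is consuming the CPU.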