Apache Flume: Distributed Log Collection for Hadoop
Published in Feb 2015 | 1st Edition | ISBN-13: 9781784392178
Author: Steven Hoffman

Steve Hoffman has 32 years of experience in software development, ranging from embedded software development to the design and implementation of large-scale, service-oriented, object-oriented systems. For the last 5 years, he has focused on infrastructure as code, including automated Hadoop and HBase implementations and data ingestion using Apache Flume. Steve holds a BS in computer engineering from the University of Illinois at Urbana-Champaign and an MS in computer science from DePaul University. He is currently a senior principal engineer at Orbitz Worldwide (http://orbitz.com/). More information on Steve can be found at http://bit.ly/bacoboy and on Twitter at @bacoboy. This is the first update to Steve's first book, Apache Flume: Distributed Log Collection for Hadoop, Packt Publishing.

Chapter 8. Monitoring Flume

The user guide for Flume states:

Monitoring in Flume is still a work in progress. Changes can happen very often. Several Flume components report metrics to the JMX platform MBean server. These metrics can be queried using Jconsole.

While JMX is fine for casual browsing of metric values, the number of eyeballs looking at Jconsole doesn't scale when you have hundreds or even thousands of servers sending data all over the place. What you need is a way to watch everything at once. However, what are the important things to look for? That is a very difficult question, but I'll try to cover several of the important items as we look at monitoring options in this chapter.

Monitoring the agent process


The most obvious type of monitoring you'll want to perform is Flume agent process monitoring, that is, making sure the agent is still running. There are many products that do this kind of process monitoring, so there is no way we can cover them all. If you work at a company of any reasonable size, chances are there is already a system in place for this. If this is the case, do not go off and build your own. The last thing operations wants is yet another screen to watch 24/7.

Monit

If you do not already have something in place, one freemium option is Monit (http://mmonit.com/monit/). The developers of Monit offer a paid version with more bells and whistles that you may want to consider. Even in its free form, it can check whether the Flume agent is running, restart it if it isn't, and send you an e-mail when this happens so that you can look into why it died.
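As a rough sketch, a Monit check for a Flume agent might look like the following. The file path, process-match pattern, init script, and alert address are all assumptions; adjust them to match your installation.

```
# Hypothetical /etc/monit.d/flume-agent.conf -- paths and names are
# examples only, not from this book.
set alert ops@example.com                 # where failure e-mails go

check process flume-agent
    matching "org.apache.flume.node.Application"
  start program = "/etc/init.d/flume-ng-agent start"
  stop program  = "/etc/init.d/flume-ng-agent stop"
  # Complain if the agent keeps flapping instead of staying up
  if 3 restarts within 5 cycles then alert
```

With this in place, Monit restarts a dead agent and notifies you, which covers the basic "is it running?" question.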

Monit does much more, but this functionality is what we will cover here. If...

Monitoring performance metrics


Now that we have covered some options for process monitoring, how do you know whether your application is actually doing the work you think it is? On many occasions, I've seen a stuck syslog-ng process that appeared to be running but just wasn't sending any data. I'm not picking on syslog-ng specifically; all software can misbehave when it encounters conditions it wasn't designed for.

When talking about Flume data flows, you need to monitor the following:

  • Data is entering sources at the expected rates

  • Data isn't overflowing your channels

  • Data is exiting sinks at the expected rates

Flume has a pluggable monitoring framework but, as mentioned at the beginning of the chapter, it is still very much a work in progress. This does not mean you shouldn't use it, as that would be foolish. It does mean you'll want to allow extra testing and integration time whenever you upgrade.
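For instance, one of the stock implementations of this framework (covered later in the chapter) exposes the agent's counters as JSON over HTTP when you start the agent with `-Dflume.monitoring.type=http -Dflume.monitoring.port=41414`. A minimal sketch of a health check against that output follows; the sample payload, channel name, and 80 percent threshold are assumptions for illustration, and in practice you would fetch the JSON from the agent rather than hardcode it.

```python
import json

# Sample of the kind of JSON the HTTP metrics endpoint returns; in
# practice you would fetch it with urllib.request from
# http://<agent-host>:41414/metrics. These counter values are made up.
SAMPLE = """{
  "CHANNEL.mem-channel": {
    "Type": "CHANNEL",
    "ChannelSize": "850",
    "ChannelCapacity": "1000",
    "ChannelFillPercentage": "85.0"
  }
}"""

def overfull_channels(metrics_json, threshold=80.0):
    """Return the names of channels filled beyond the given percentage."""
    metrics = json.loads(metrics_json)
    return [name for name, counters in metrics.items()
            if name.startswith("CHANNEL.")
            and float(counters.get("ChannelFillPercentage", 0)) > threshold]

print(overfull_channels(SAMPLE))
```

A check like this, run periodically from your monitoring system, turns the second bullet above ("data isn't overflowing your channels") into an alert instead of something a human has to notice.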

Note

While not covered in the Flume documentation, it is common to enable JMX in your Flume JVM (http://bit.ly...
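As a sketch, remote JMX is typically enabled by adding the standard JVM flags to JAVA_OPTS in conf/flume-env.sh. The port number here is arbitrary, and disabling authentication and SSL is only acceptable on a trusted network:

```
# conf/flume-env.sh -- the port (5445) is an example; pick one open in
# your firewall. Disabling auth/SSL is for illustration only.
export JAVA_OPTS="$JAVA_OPTS \
  -Dcom.sun.management.jmxremote \
  -Dcom.sun.management.jmxremote.port=5445 \
  -Dcom.sun.management.jmxremote.authenticate=false \
  -Dcom.sun.management.jmxremote.ssl=false"
```

With this in place, you can point Jconsole (or any JMX client) at port 5445 on the agent's host to browse its MBeans.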

Summary


In this chapter, we covered monitoring Flume agents both at the process level and at the level of internal metrics (that is, whether the agent is actually doing work).

Monit and Nagios were introduced as open source options for process watching.

Next, we covered the Flume agent's internal monitoring metrics, using the Ganglia and JSON-over-HTTP implementations that ship with Apache Flume.

Finally, we covered how to write a custom monitoring implementation if you need to feed metrics directly into some other tool that Flume doesn't support by default.

In our final chapter, we will discuss some general considerations for your Flume deployment.

