You're reading from  Apache Flume: Distributed Log Collection for Hadoop

Product type: Book
Published in: Feb 2015
Reading level: Intermediate
ISBN-13: 9781784392178
Edition: 1st Edition
Author: Steven Hoffman

Steve Hoffman has 32 years of experience in software development, ranging from embedded software development to the design and implementation of large-scale, service-oriented, object-oriented systems. For the last 5 years, he has focused on infrastructure as code, including automated Hadoop and HBase implementations and data ingestion using Apache Flume. Steve holds a BS in computer engineering from the University of Illinois at Urbana-Champaign and an MS in computer science from DePaul University. He is currently a senior principal engineer at Orbitz Worldwide (http://orbitz.com/). More information on Steve can be found at http://bit.ly/bacoboy and on Twitter at @bacoboy. This is the first update to Steve's first book, Apache Flume: Distributed Log Collection for Hadoop, Packt Publishing.

The file channel


A file channel stores events on the local filesystem of the agent. Though it's slower than the memory channel, it provides a durable storage path that can survive most issues, such as an agent restart, and should be used wherever a gap in your data flow is undesirable.
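To make the trade-off concrete, here is a minimal sketch of an agent configuration that declares one channel of each type. The agent name (agent) and the channel names (fast and durable) are hypothetical, chosen only for illustration; the type values memory and file are the standard Flume channel types.

```properties
# Hypothetical agent named "agent" with two channels (names are illustrative).
agent.channels = fast durable

# Memory channel: events are held in memory only, so they are lost
# if the agent stops before a sink has taken them.
agent.channels.fast.type = memory

# File channel: events are written to the agent's local filesystem,
# trading some throughput for durability.
agent.channels.durable.type = file
```

A source or sink would then be pointed at whichever channel suits the flow: the memory channel where occasional loss is tolerable, the file channel where it is not.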

This durability is provided by a combination of a write-ahead log (WAL) and one or more file storage directories. The WAL tracks all input to and output from the channel in an atomically safe way. If the agent is restarted, the WAL can be replayed to make sure that all the events that came into the channel (puts) have been written out (takes) before the stored data is purged from the local filesystem.
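As a rough sketch of how these storage locations might be configured, the fragment below sets the file channel's checkpointDir and dataDirs properties. The agent name (agent), the channel name (c1), and every path shown are assumptions made for illustration; adjust them to your own layout.

```properties
# Hypothetical agent "agent" with a single file channel "c1".
agent.channels = c1
agent.channels.c1.type = file

# Directory where the channel keeps its checkpoint (path is illustrative).
agent.channels.c1.checkpointDir = /flume/c1/checkpoint

# One or more data directories, given as a comma-separated list
# (paths are illustrative).
agent.channels.c1.dataDirs = /flume/c1/data1,/flume/c1/data2
```

Listing more than one data directory is optional. Placing the directories on separate physical disks can help throughput, since writes are then spread across devices.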

Additionally, the file channel supports the encryption of data written to the filesystem if your data handling policy requires that all data on the disk (even temporarily) be encrypted. I won't cover this here, but should you need it, there is an example in the Flume User Guide (http://flume.apache...
