Apache Accumulo for Developers

Discover how to build Accumulo, Hadoop, and ZooKeeper clusters from scratch on both Windows and Linux. With this book’s examples-based approach, you’ll learn the painless way through clear instructions and real-world exercises.

Guðmundur Jón Halldórsson

Book Details

ISBN 13: 9781783285990
Paperback: 120 pages

About This Book

  • Shows you how to build Accumulo, Hadoop, and ZooKeeper clusters from scratch on both Windows and Linux
  • Gives you hands-on knowledge of how to run Accumulo on the Amazon EC2, Google Cloud Platform, Rackspace, and Windows Azure cloud platforms
  • Packed with practical examples to enable you to manipulate Accumulo with ease

Who This Book Is For

This book is for developers new to Accumulo who want a good grounding in how to use it. It assumes that you understand how Hadoop works, both HDFS and MapReduce. No prior knowledge of ZooKeeper is assumed.

Table of Contents

Chapter 1: Building an Accumulo Cluster from Scratch
Necessary requirements
Setting up Cygwin
Setting up Hadoop
Setting up ZooKeeper
Setting up and configuring Accumulo
Starting the Accumulo cluster
Connecting to the Accumulo cluster using Java
Summary
Chapter 2: Monitoring and Managing Accumulo
Monitoring
Elasticity
Failover
Resource management
Summary
Chapter 3: Integrating Accumulo into Various Cloud Platforms
Amazon EC2
Google Cloud Platform
Rackspace
Windows Azure
Summary
Chapter 4: Optimizing Accumulo Performance
Prerequisites
Hadoop performance
ZooKeeper performance
Accumulo performance
Summary
Chapter 5: Security
Visibility
Security expression
Authorization
User authorizations
Handling secure authorization
Query Services Layer
Summary

What You Will Learn

  • Set up Hadoop, ZooKeeper, and Accumulo
  • Monitor clusters, covering both performance and application logs
  • Secure your data in Accumulo
  • Optimize Hadoop, ZooKeeper, and Accumulo performance
  • Integrate with various cloud platforms
  • Use the Accumulo command-line shell
  • Employ Ganglia to monitor the cluster and Graylog2 to monitor application logs
  • Understand what tools are needed to optimize Accumulo performance

In Detail

Accumulo is a sorted, distributed key/value store designed to handle large amounts of data. It is robust and scalable, and its performance makes it well suited to real-time data storage. Apache Accumulo is based on Google's Bigtable design and is built on top of Apache Hadoop, ZooKeeper, and Thrift.

Apache Accumulo for Developers is your guide to building an Accumulo cluster, single-node or multi-node, on-site or in the cloud. Accumulo has been proven to handle petabytes of data with cell-level security and real-time analysis, and this step-by-step guide shows you how to take full advantage of that power.

Apache Accumulo for Developers looks at the process of setting up three systems (Hadoop, ZooKeeper, and Accumulo) and then configuring, monitoring, and securing them.

You will learn to connect Accumulo to both Hadoop and ZooKeeper. You will also learn how to monitor the cluster (single-node or multi-node) to find any performance bottlenecks, and then integrate with Amazon EC2, Google Cloud Platform, Rackspace, and Windows Azure. When integrating with these cloud platforms, we will focus on scripting as well.

You will also learn to troubleshoot clusters with monitoring tools, and use Accumulo cell-level security to secure your data.
