Real-time Analytics with Storm and Cassandra

Solve real-time analytics problems effectively using Storm and Cassandra

Learning
Shilpi

eBook: $35.99 (RRP $35.99)
Print + eBook: $44.99 (RRP $44.99)

Book Details

ISBN 13: 9781784395490
Paperback: 220 pages

Book Description

This book will teach you how to use Storm for real-time data processing and to make your applications highly available with no downtime using Cassandra.

The book starts off with the basics of Storm and its components along with setting up the environment for the execution of a Storm topology in local and distributed mode. Moving on, you will explore the Storm and Zookeeper configurations, understand the Storm UI, set up Storm clusters, and monitor Storm clusters using various tools. You will then add NoSQL persistence to Storm and set up a Cassandra cluster. You will do all this while being guided by the best practices for Storm and Cassandra applications. Next, you will learn about data partitioning and consistent hashing in Cassandra through examples and also see high availability features and replication in Cassandra. Finally, you'll learn about different methods that you can use to manage and maintain Cassandra and Storm.

Table of Contents

Chapter 1: Let's Understand Storm
Distributed computing problems
Solutions for complex distributed use cases
A high-level view of various components of Storm
Delving into the internals of Storm
Quiz time
Summary
Chapter 2: Getting Started with Your First Topology
Prerequisites for setting up Storm
Components of a Storm topology
Executing a sample Storm topology – local mode
Executing the topology in the distributed mode
Executing the topology from Command Prompt
Quiz time
Summary
Chapter 3: Understanding Storm Internals by Examples
Customizing Storm spouts
Anchoring and acking
Stream groupings
Quiz time
Summary
Chapter 4: Storm in a Clustered Mode
The Storm cluster setup
Zookeeper configurations
Storm configurations
Storm monitoring tools
Quiz time
Summary
Chapter 5: Storm High Availability and Failover
An overview of RabbitMQ
Installing the RabbitMQ cluster
Integrating Storm with RabbitMQ
Building high availability of components
The Storm isolation scheduler
Quiz time
Summary
Chapter 6: Adding NoSQL Persistence to Storm
The advantages of Cassandra
Columnar database fundamentals
Setting up the Cassandra cluster
Multiple data centers
Introduction to CQLSH
Introduction to CLI
Using different client APIs to access Cassandra
Storm topology wired to the Cassandra store
The best practices for Storm/Cassandra applications
Quiz time
Summary
Chapter 7: Cassandra Partitioning, High Availability, and Consistency
Consistent hashing
Replication in Cassandra and strategies
Cassandra consistency
Quiz time
Summary
Chapter 8: Cassandra Management and Maintenance
Cassandra – gossip protocol
Cassandra cluster scaling – adding a new node
Cassandra cluster – replacing a dead node
The replication factor
The nodetool commands
Cassandra fault tolerance
Cassandra monitoring systems
Quiz time
Summary
Chapter 9: Storm Management and Maintenance
Scaling the Storm cluster – adding new supervisor nodes
Scaling the Storm cluster and rebalancing the topology
Setting up workers and parallelism to enhance processing
Storm troubleshooting
Quiz time
Summary
Chapter 10: Advance Concepts in Storm
Building a Trident topology
Understanding the Trident API
Examples and illustrations
Quiz time
Summary
Chapter 11: Distributed Cache and CEP with Storm
The need for distributed caching in Storm
Introduction to memcached
Introduction to the complex event processing engine
Quiz time
Summary

What You Will Learn

  • Integrate Storm applications with RabbitMQ for real-time analysis and processing of messages
  • Monitor highly distributed applications using Nagios
  • Integrate the Cassandra data store with Storm
  • Develop and maintain distributed Storm applications in conjunction with Cassandra and an in-memory database (memcached)
  • Build a Trident topology that enables real-time computing with Storm
  • Tune performance for Storm topologies based on the SLA and requirements of the application
  • Use Esper with the Storm framework for rapid development of applications
