Storm Real-time Processing Cookbook

Java developers can expand into real-time data processing with this fantastic guide to Storm. Using a cookbook approach with lots of practical recipes, it’s the user-friendly way to learn how to process unlimited data streams.

Quinton Anderson

eBook: $29.99 (RRP $29.99)
Print + eBook: $49.99 (RRP $49.99)

Book Details

ISBN-13: 9781782164425
Paperback: 254 pages

Book Description

Storm is a free and open source distributed real-time computation system. Storm makes it easy to reliably process unbounded streams of data, doing for real-time processing what Hadoop did for batch processing. Storm is simple, can be used with any programming language, and is a lot of fun to use!
Storm Real-time Processing Cookbook offers basic to advanced recipes on Storm for real-time computation.

The book begins with setting up the development environment and then teaches log stream processing. It goes on to cover a real-time payments workflow, distributed RPC, integration with other software such as Hadoop and Apache Camel, and more.

Table of Contents

Chapter 1: Setting Up Your Development Environment
  • Introduction
  • Setting up your development environment
  • Distributed version control
  • Creating a "Hello World" topology
  • Creating a Storm cluster – provisioning the machines
  • Creating a Storm cluster – provisioning Storm
  • Deriving basic click statistics
  • Unit testing a bolt
  • Implementing an integration test
  • Deploying to the cluster
Chapter 2: Log Stream Processing
  • Introduction
  • Creating a log agent
  • Creating the log spout
  • Rule-based analysis of the log stream
  • Indexing and persisting the log data
  • Counting and persisting log statistics
  • Creating an integration test for the log stream cluster
  • Creating a log analytics dashboard
Chapter 3: Calculating Term Importance with Trident
  • Introduction
  • Creating a URL stream using a Twitter filter
  • Deriving a clean stream of terms from the documents
  • Calculating the relative importance of each term
Chapter 4: Distributed Remote Procedure Calls
  • Introduction
  • Using DRPC to complete the required processing
  • Integration testing of a Trident topology
  • Implementing a rolling window topology
  • Simulating time in integration testing
Chapter 5: Polyglot Topology
  • Introduction
  • Implementing the multilang protocol in Qt
  • Implementing the SplitSentence bolt in Qt
  • Implementing the count bolt in Ruby
  • Defining the word count topology in Clojure
Chapter 6: Integrating Storm and Hadoop
  • Introduction
  • Implementing TF-IDF in Hadoop
  • Persisting documents from Storm
  • Integrating the batch and real-time views
Chapter 7: Real-time Machine Learning
  • Introduction
  • Implementing a transactional topology
  • Creating a Random Forest classification model using R
  • Operational classification of transactional streams using Random Forest
  • Creating an association rules model in R
  • Creating a recommendation engine
  • Real-time online machine learning
Chapter 8: Continuous Delivery
  • Introduction
  • Setting up a CI server
  • Setting up system environments
  • Defining a delivery pipeline
  • Implementing automated acceptance testing
Chapter 9: Storm on AWS
  • Introduction
  • Deploying Storm on AWS using Pallet
  • Setting up a Virtual Private Cloud
  • Deploying Storm into Virtual Private Cloud using Vagrant

What You Will Learn

  • Create a log spout
  • Consume messages from a JMS queue
  • Implement unidirectional synchronization based on a data stream
  • Execute disaster recovery on a separate AWS region
