Storm Real-time Processing Cookbook

Java developers can expand into real-time data processing with this fantastic guide to Storm. Using a cookbook approach with lots of practical recipes, it’s the user-friendly way to learn how to process unlimited data streams.

Quinton Anderson

Book Details

ISBN 13: 9781782164425
Paperback: 254 pages

Book Description

Storm is a free and open-source distributed real-time computation system. Storm makes it easy to reliably process unbounded streams of data, doing for real-time processing what Hadoop did for batch processing. Storm is simple, can be used with any programming language, and is a lot of fun to use!
Storm Real-time Processing Cookbook offers basic to advanced recipes for real-time computation with Storm.

The book begins by setting up the development environment and then covers log stream processing. It continues with a real-time payments workflow, distributed RPC, integration with other software such as Hadoop and Apache Camel, and more.

Table of Contents

Chapter 1: Setting Up Your Development Environment
Introduction
Setting up your development environment
Distributed version control
Creating a "Hello World" topology
Creating a Storm cluster – provisioning the machines
Creating a Storm cluster – provisioning Storm
Deriving basic click statistics
Unit testing a bolt
Implementing an integration test
Deploying to the cluster
Chapter 2: Log Stream Processing
Introduction
Creating a log agent
Creating the log spout
Rule-based analysis of the log stream
Indexing and persisting the log data
Counting and persisting log statistics
Creating an integration test for the log stream cluster
Creating a log analytics dashboard
Chapter 3: Calculating Term Importance with Trident
Introduction
Creating a URL stream using a Twitter filter
Deriving a clean stream of terms from the documents
Calculating the relative importance of each term
Chapter 4: Distributed Remote Procedure Calls
Introduction
Using DRPC to complete the required processing
Integration testing of a Trident topology
Implementing a rolling window topology
Simulating time in integration testing
Chapter 5: Polyglot Topology
Introduction
Implementing the multilang protocol in Qt
Implementing the SplitSentence bolt in Qt
Implementing the count bolt in Ruby
Defining the word count topology in Clojure
Chapter 6: Integrating Storm and Hadoop
Introduction
Implementing TF-IDF in Hadoop
Persisting documents from Storm
Integrating the batch and real-time views
Chapter 7: Real-time Machine Learning
Introduction
Implementing a transactional topology
Creating a Random Forest classification model using R
Operational classification of transactional streams using Random Forest
Creating an association rules model in R
Creating a recommendation engine
Real-time online machine learning
Chapter 8: Continuous Delivery
Introduction
Setting up a CI server
Setting up system environments
Defining a delivery pipeline
Implementing automated acceptance testing
Chapter 9: Storm on AWS
Introduction
Deploying Storm on AWS using Pallet
Setting up a Virtual Private Cloud
Deploying Storm into Virtual Private Cloud using Vagrant

What You Will Learn

  • Create a log spout
  • Consume messages from a JMS queue
  • Implement unidirectional synchronization based on a data stream
  • Execute disaster recovery on a separate AWS region
