Learning Storm

Create real-time stream processing applications with Apache Storm

Ankit Jain, Anand Nalya

Book Details

ISBN 13: 9781783981328
Paperback: 252 pages

Book Description

Starting with the very basics of Storm, you will learn how to set up Storm on a single machine and move on to deploying Storm on your cluster. You will understand how Kafka can be integrated with Storm using the Kafka spout.

You will then explore Trident, Storm's high-level abstraction, to perform stateful stream processing while guaranteeing that each message is processed once in every topology. You will move on to learn how to integrate Hadoop with Storm. Next, you will learn how to integrate Storm with other well-known Big Data technologies such as HBase, Redis, and Kafka to realize the full potential of Storm.

Finally, you will perform in-depth case studies on Apache log processing and machine learning with a focus on Storm, and through these case studies, you will discover Storm's realm of possibilities.

Table of Contents

Chapter 1: Setting Up Storm on a Single Machine
Features of Storm
Storm components
The Storm data model
Summary
Chapter 2: Setting Up a Storm Cluster
Setting up a distributed Storm cluster
Deploying a topology on a remote Storm cluster
Configuring the parallelism of a topology
Rebalancing the parallelism of a topology
Stream grouping
Guaranteed message processing
Summary
Chapter 3: Monitoring the Storm Cluster
Starting to use the Storm UI
Monitoring a topology using the Storm UI
Cluster statistics using the Nimbus thrift client
Summary
Chapter 4: Storm and Kafka Integration
The Kafka architecture
Setting up Kafka
A sample Kafka producer
Integrating Kafka with Storm
Summary
Chapter 5: Exploring High-level Abstraction in Storm with Trident
Introducing Trident
Understanding Trident's data model
Writing Trident functions, filters, and projections
Trident repartitioning operations
Trident aggregators
Utilizing the groupBy operation
A non-transactional topology
A sample Trident topology
Maintaining the topology state with Trident
A transactional topology
The opaque transactional topology
Distributed RPC
When to use Trident
Summary
Chapter 6: Integration of Storm with Batch Processing Tools
Exploring Apache Hadoop
Installing Apache Hadoop
Integration of Storm with Hadoop
Deploying Storm-Starter topologies on Storm-YARN
Summary
Chapter 7: Integrating Storm with JMX, Ganglia, HBase, and Redis
Monitoring the Storm cluster using JMX
Monitoring the Storm cluster using Ganglia
Integrating Storm with HBase
Integrating Storm with Redis
Summary
Chapter 8: Log Processing with Storm
Server log-processing elements
Producing the Apache log in Kafka
Splitting the server log line
Identifying the country, the operating system type, and the browser type from the logfile
Extracting the searched keyword
Persisting the process data
Defining a topology and the Kafka spout
Deploying a topology
MySQL queries
Summary
Chapter 9: Machine Learning
Exploring machine learning
Using Trident-ML
The use case – clustering synthetic control data
Producing a training dataset into Kafka
Building a Trident topology to build the clustering model
Summary

What You Will Learn

  • Learn the core concepts of Apache Storm and real-time processing
  • Deploy Storm in the local and clustered modes
  • Design and develop Storm topologies to solve real-world problems
  • Read data from external sources such as Apache Kafka for processing in Storm and store the output into HBase and Redis
  • Create Trident topologies to support various message-processing semantics
  • Monitor the health of a Storm cluster
