Storm Real-time Processing Cookbook


  • Learn the key concepts of processing data in real time with Storm
  • Covers concepts ranging from log stream processing to data management with Storm
  • Written in a cookbook style, with plenty of practical recipes, well-explained code examples, and relevant screenshots and diagrams

Book Details

Language : English
Paperback : 254 pages [ 235mm x 191mm ]
Release Date : August 2013
ISBN : 1782164421
ISBN 13 : 9781782164425
Author(s) : Quinton Anderson
Topics and Technologies : All Books, Big Data and Business Intelligence, Cookbooks, Open Source

Table of Contents

Preface
Chapter 1: Setting Up Your Development Environment
Chapter 2: Log Stream Processing
Chapter 3: Calculating Term Importance with Trident
Chapter 4: Distributed Remote Procedure Calls
Chapter 5: Polyglot Topology
Chapter 6: Integrating Storm and Hadoop
Chapter 7: Real-time Machine Learning
Chapter 8: Continuous Delivery
Chapter 9: Storm on AWS
Index
  • Chapter 1: Setting Up Your Development Environment
    • Introduction
    • Setting up your development environment
    • Distributed version control
    • Creating a "Hello World" topology
    • Creating a Storm cluster – provisioning the machines
    • Creating a Storm cluster – provisioning Storm
    • Deriving basic click statistics
    • Unit testing a bolt
    • Implementing an integration test
    • Deploying to the cluster
  • Chapter 2: Log Stream Processing
    • Introduction
    • Creating a log agent
    • Creating the log spout
    • Rule-based analysis of the log stream
    • Indexing and persisting the log data
    • Counting and persisting log statistics
    • Creating an integration test for the log stream cluster
    • Creating a log analytics dashboard
  • Chapter 4: Distributed Remote Procedure Calls
    • Introduction
    • Using DRPC to complete the required processing
    • Integration testing of a Trident topology
    • Implementing a rolling window topology
    • Simulating time in integration testing
  • Chapter 5: Polyglot Topology
    • Introduction
    • Implementing the multilang protocol in Qt
    • Implementing the SplitSentence bolt in Qt
    • Implementing the count bolt in Ruby
    • Defining the word count topology in Clojure
  • Chapter 7: Real-time Machine Learning
    • Introduction
    • Implementing a transactional topology
    • Creating a Random Forest classification model using R
    • Operational classification of transactional streams using Random Forest
    • Creating an association rules model in R
    • Creating a recommendation engine
    • Real-time online machine learning
  • Chapter 8: Continuous Delivery
    • Introduction
    • Setting up a CI server
    • Setting up system environments
    • Defining a delivery pipeline
    • Implementing automated acceptance testing
  • Chapter 9: Storm on AWS
    • Introduction
    • Deploying Storm on AWS using Pallet
    • Setting up a Virtual Private Cloud
    • Deploying Storm into Virtual Private Cloud using Vagrant

                    Quinton Anderson

                    Quinton Anderson is a software engineer with a background in, and focus on, real-time computational systems. His career has been split between building real-time communication systems for defense systems and building enterprise applications within financial services and banking. Quinton does not align himself with any particular technology or programming language, but rather prefers to focus on sound engineering and polyglot development. He is passionate about open source and is an active member of the Storm community; he has also enjoyed delivering various Storm-based solutions. Quinton's next area of focus is machine learning, specifically Deep Belief networks as they pertain to robotics. Please follow his blog entries on Computational Theory, general IT concepts, and Deep Belief networks for more information. You can find more information on Quinton via his LinkedIn profile (http://au.linkedin.com/pub/quinton-anderson/37/422/11b/) or, more importantly, view and contribute to the source code available at his GitHub (https://github.com/quintona) and Bitbucket (https://bitbucket.org/qanderson) accounts.



                    Errata

                    - 4 submitted: last submission 05 Feb 2014

                    Errata type: Others | Page number: 53

                    The command line code under step 3 is incorrect. The correct one is as follows:

                    java -jar logstash-1.1.7-monolithic.jar agent -f shipper.conf

                    Errata type: Technical | Page number: 34

                    At bullet point 4, "source/main/java" should be "src/main/java"

                     

                    Errata type: Typo | Page number: 64

                    [The correction is shown in bold]

                    In the intro of the Indexing and persisting the log data recipe, the sentence reads "In order to achieve this, the recipe integrates with an open source product call Elastic Search,..."

                    It should instead read "In order to achieve this, the recipe integrates with an open source product called Elastic Search,..."

                     

                    On page 23, there is a Dropbox link in the code that reads "http://dl.dropbox.com/u/1537815/precise64.box". This link is currently unavailable. You can use the following link instead:

                    http://files.vagrantup.com/precise64.box



                    What you will learn from this book

                    • Create a log spout
                    • Consume messages from a JMS queue
                    • Implement unidirectional synchronization based on a data stream
                    • Execute disaster recovery on a separate AWS region

                    In Detail

                    Storm is a free and open source distributed real-time computation system. Storm makes it easy to reliably process unbounded streams of data, doing for real-time processing what Hadoop did for batch processing. Storm is simple, can be used with any programming language, and is a lot of fun to use!
                    Storm Real-time Processing Cookbook offers basic to advanced recipes on using Storm for real-time computation.

                    The book begins with setting up the development environment and then teaches log stream processing. This is followed by a real-time payments workflow, distributed RPC, integration with other software such as Hadoop and Apache Camel, and more.

                    Approach

                    A Cookbook with plenty of practical recipes for different uses of Storm.

                    Who this book is for

                    If you are a Java developer with basic knowledge of real-time processing and would like to learn Storm to process unbounded streams of data in real time, then this book is for you.
