Practical Real-time Data Processing and Analytics

Learn
  • Get an introduction to the established real-time stack
  • Understand how all the components integrate with one another
  • Get a thorough understanding of the basic building blocks for designing real-time solutions
  • Add search and visualization capabilities to your real-time solution
  • Get conceptually and practically acquainted with real-time analytics
  • Be well equipped to apply the knowledge and create your own solutions
About

With the rise of Big Data, there is an increasing need to process large amounts of data continuously, with a shorter turnaround time. Real-time data processing involves continuous input, processing and output of data, with the condition that the time required for processing is as short as possible.

This book covers the majority of the existing and evolving open source technology stack for real-time processing and analytics. You will get to know about all the real-time solution aspects, from the source to the presentation to persistence. Through this practical book, you’ll be equipped with a clear understanding of how to solve challenges on your own.

We’ll cover topics such as how to set up components, basic executions, integrations, advanced use cases, alerts, and monitoring. You’ll be exposed to the popular tools used in real-time processing today, such as Apache Spark, Apache Flink, and Apache Storm. Finally, you will put your knowledge to use by implementing all of the techniques in a practical, real-world use case.

By the end of this book, you will have a solid understanding of all the aspects of real-time data processing and analytics, and will know how to deploy the solutions in production environments in the best possible manner.

Features
  • Learn about the various challenges in real-time data processing and use the right tools to overcome them
  • This book covers popular tools and frameworks such as Spark, Flink, and Apache Storm to solve all your distributed processing problems
  • A practical guide filled with examples, tips, and tricks to help you perform efficient Big Data processing in real time
Page Count 360
Course Length 10 hours 48 minutes
ISBN 9781787281202
Date Of Publication 27 Sep 2017
What is big data?
Big data infrastructure
Real-time analytics – the myth and the reality
Near real-time solution – an architecture that works
Lambda architecture – analytics possibilities
IoT – thoughts and possibilities
Cloud – considerations for NRT and IoT
Summary
The NRT system and its building blocks
NRT – high-level system view
NRT – technology view
Summary
Spark – packaging and API
RDD pragmatic exploration
Shared variables – broadcast variables and accumulators
Summary

Authors

Shilpi Saxena

Shilpi Saxena is an IT professional and a technology evangelist. She is an engineer who has had exposure to various domains (the machine-to-machine space, healthcare, telecom, hiring, and manufacturing). She has experience in all aspects of the conception and execution of enterprise solutions. She has been architecting, managing, and delivering solutions in the big data space for the last 3 years. She also manages a high-performance, geographically distributed team of elite engineers.

Shilpi has more than 12 years of experience (3 years in the big data space) in the development and execution of various facets of enterprise solutions, in both the product and services dimensions of the software industry. An engineer by degree and profession, she has worn varied hats, such as developer, technical leader, product owner, and tech manager, and she has seen all the flavors that the industry has to offer. She has architected and worked through some of the pioneering production implementations in Big Data on Storm and Impala with auto-scaling in AWS.

Shilpi also authored Real-time Analytics with Storm and Cassandra with Packt Publishing.

Saurabh Gupta

Saurabh Gupta is a software engineer who has worked across software requirements, design, execution, and delivery. Saurabh has more than 3 years of experience working in the Big Data domain. He designs and manages real-time as well as batch processing projects running in production, using technologies such as Impala, Storm, NiFi, and Kafka, with deployment on AWS using Docker. Saurabh has also worked in product development and delivery.

Saurabh has a total of 10 years of experience in the IT industry (3+ years in Big Data). He has exposure to various IoT use cases, including telecom, healthcare, smart cities, and smart cars.