
You're reading from  Big Data Analytics with Java

Product type: Book
Published in: Jul 2017
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781787288980
Edition: 1st Edition
Author (1)
RAJAT MEHTA

The author is a VP (Technical Architect) in technology at JP Morgan Chase in New York. He is a Sun Certified Java Developer and has worked on Java-related technologies for more than 16 years. For the past few years, his role has heavily involved using the big data stack and running analytics on it. He is also a contributor to various open source projects that are available on his GitHub repository, and a frequent writer for dev magazines.

Chapter 12. Real-Time Analytics on Big Data

At some point, most of us have requested an insurance quote. To get a quote for a car, we fill in our details, and based on our credit history and other information the application returns a quote in real time. The application analyzes our data as it arrives and predicts the quote from it. For years, these applications mostly followed rule-based approaches, with a powerful rule engine running behind the scenes. More recently, they have started using machine learning to analyze the data further and make predictions at that point in time. All the predictions and analysis that happen at that instant are real-time analytics. Some of the most popular websites, such as Netflix, and well-known ad networks use real-time analytics, and with the arrival of new devices as part of the Internet of Things (IoT) wave, collection and analysis of data in real time has become the need of the...

Real-time analytics


As the name suggests, real-time analytics provides analysis and its results in real time. Big data has mostly been used in batch mode, where queries on the data run for a long time and the results are analyzed later. This approach has been changing lately, mainly because certain use cases require immediate results. Real-time analytics requires a separate architecture that caters not only to data collection and data parsing, but also to data analysis at the same time.

Let's try to understand the concept of real-time analytics using the following diagram:

As you can see, data today comes from many sources, whether mobile devices, websites, third-party applications, or even the Internet of Things (sensors). All this data needs a way to flow from its source devices to a central unit where it can be parsed, cleaned, and finally ingested. It is at this ingestion time that the data can also be analyzed...
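To make the idea of analyzing data at ingestion time concrete, here is a minimal plain-Java sketch. It is not taken from the book: the class name `IngestionPipeline`, the `source,value` message format, and the choice of a running average as the analysis are all assumptions made for illustration. Each raw message is parsed, malformed input is dropped (the cleaning step), and a per-source aggregate is updated as the reading is ingested, so a result is available without a later batch job.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch: parse, clean, and analyze readings as they are ingested. */
class IngestionPipeline {

    /** A parsed reading; the "source,value" wire format is an assumption. */
    static final class Reading {
        final String source;
        final double value;

        Reading(String source, double value) {
            this.source = source;
            this.value = value;
        }
    }

    /** Per source: index 0 holds the count, index 1 holds the sum. */
    private final Map<String, double[]> stats = new HashMap<>();

    /** Parse and clean: malformed or non-finite input yields an empty result. */
    static Optional<Reading> parse(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        String[] parts = raw.trim().split(",");
        if (parts.length != 2 || parts[0].trim().isEmpty()) {
            return Optional.empty();
        }
        try {
            double value = Double.parseDouble(parts[1].trim());
            if (!Double.isFinite(value)) {
                return Optional.empty();
            }
            return Optional.of(new Reading(parts[0].trim(), value));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    /** Ingest one raw message and update the running aggregate immediately. */
    boolean ingest(String raw) {
        Optional<Reading> reading = parse(raw);
        reading.ifPresent(r -> {
            double[] s = stats.computeIfAbsent(r.source, k -> new double[2]);
            s[0] += 1;        // count of readings seen so far
            s[1] += r.value;  // sum of values seen so far
        });
        return reading.isPresent();
    }

    /** Current average for a source, or NaN if nothing has been ingested for it. */
    double average(String source) {
        double[] s = stats.get(source);
        return s == null ? Double.NaN : s[1] / s[0];
    }

    public static void main(String[] args) {
        IngestionPipeline pipeline = new IngestionPipeline();
        String[] incoming = {"sensor-1,20.0", "mobile,3.5", "garbage", "sensor-1,22.0"};
        for (String raw : incoming) {
            pipeline.ingest(raw);
        }
        System.out.println(pipeline.average("sensor-1")); // prints 21.0
    }
}
```

In a real deployment the in-memory map would likely be replaced by a distributed store or a streaming engine; the sketch only shows that the analysis step can run in the same pass as ingestion.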

Summary


In this chapter, we learned about real-time analytics and saw how big data can be used for real-time analytics in addition to batch processing. We introduced Impala, a product that can run fast SQL queries on big data, which is usually stored in Parquet format in HDFS. While looking at Impala, we worked through a simple case study on flight analytics. We then covered Apache Kafka, a messaging product that can be used in conjunction with big data technologies to build real-time data stacks. Kafka is a scalable messaging solution, and we showed how it can be integrated with the Spark Streaming module of Apache Spark. Spark Streaming lets you collect data in mini batches in real time, and it treats a sequence of these mini batches as a stream. Spark Streaming is becoming very popular because it is a scalable solution that fits the needs of many users. Finally, we covered a few case studies using Apache Kafka and Spark Streaming and showed how complex use cases...
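The mini-batch idea mentioned above can be illustrated with a small plain-Java sketch. This is not the Spark Streaming API and is not taken from the book: the names `MiniBatcher`, `Event`, and `toMiniBatches` are hypothetical, and the events are held in an in-memory list rather than read from Kafka. The sketch groups timestamped events into fixed-length intervals and returns the batches in time order, which is the sense in which a stream can be viewed as a sequence of mini batches. One simplification to note: intervals with no events produce no batch here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical sketch of micro-batching; this is not the Spark Streaming API. */
class MiniBatcher {

    /** A timestamped message, such as one read from a messaging system. */
    static final class Event {
        final long timestampMs;
        final String payload;

        Event(long timestampMs, String payload) {
            this.timestampMs = timestampMs;
            this.payload = payload;
        }
    }

    /**
     * Groups events into batches that each cover batchIntervalMs milliseconds.
     * The returned list is ordered by time and plays the role of the "stream".
     */
    static List<List<String>> toMiniBatches(List<Event> events, long batchIntervalMs) {
        if (batchIntervalMs <= 0) {
            throw new IllegalArgumentException("batchIntervalMs must be positive");
        }
        // TreeMap keeps batch indexes sorted, so batches come out in time order.
        TreeMap<Long, List<String>> batches = new TreeMap<>();
        for (Event e : events) {
            long batchIndex = Math.floorDiv(e.timestampMs, batchIntervalMs);
            batches.computeIfAbsent(batchIndex, k -> new ArrayList<>()).add(e.payload);
        }
        return new ArrayList<>(batches.values());
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(
                new Event(0, "a"),
                new Event(400, "b"),
                new Event(1200, "c"),
                new Event(1900, "d"),
                new Event(2500, "e"));
        // One-second batches over the events above.
        System.out.println(toMiniBatches(events, 1000)); // prints [[a, b], [c, d], [e]]
    }
}
```

A streaming engine would apply the same computation to each batch as it closes, instead of collecting all events first as this sketch does.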
