Real Time Streaming using Apache Spark Streaming [Video]

Learn
  • Implement stream processing using Apache Spark Streaming
  • Consume events from a source (for instance, Kafka), apply logic to them, and send them to a data sink.
  • Understand how to deduplicate events when you have a system that ensures at-least-once delivery.
  • Learn to tackle common stream processing problems.
  • Create a job to analyze data in real time using the Apache Spark Streaming API.
  • Master event time and processing time
  • Understand single-event processing and the micro-batch approach to processing events
  • Learn to sort infinite event streams
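
To make the deduplication item above concrete, here is a minimal sketch in plain Python. It is not the Spark Streaming API and is not taken from the course; it only illustrates the idea. It assumes every event carries a unique identifier, and the field name `event_id` and the helper `deduplicate` are hypothetical names chosen for this example.

```python
from typing import Dict, Iterable, Iterator, Set


def deduplicate(events: Iterable[Dict], seen_ids: Set[str]) -> Iterator[Dict]:
    """Yield each event once, skipping ids that were already processed."""
    for event in events:
        event_id = event["event_id"]  # hypothetical field name (assumption)
        if event_id in seen_ids:
            continue  # a redelivered duplicate, so skip it
        seen_ids.add(event_id)
        yield event


# The same event arrives twice, as can happen under at-least-once delivery.
batch = [
    {"event_id": "a1", "page": "/home"},
    {"event_id": "b2", "page": "/cart"},
    {"event_id": "a1", "page": "/home"},
]
seen: Set[str] = set()
unique_events = list(deduplicate(batch, seen))
```

Because the stream is unbounded, a set like `seen` would grow forever in a long-running job, so a real pipeline has to expire old identifiers at some point, for example once they fall behind a watermark.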
About

Spark is a technology that lets us perform big data processing in the MapReduce paradigm very rapidly, because it does the processing in memory and avoids extensive I/O operations.

Recently, the streaming approach to processing events in near real time has become more widely adopted and more necessary. In this course, you will learn how to handle large volumes of unbounded streaming data. You will analyze data and draw conclusions from it. Furthermore, we will look at common problems when processing event streams: sorting, watermarks, deduplication, and keeping state (for example, user sessions). You will also implement stream processing using Spark Streaming and analyze traffic on a web page in real time.
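
As an illustration of how event time and watermarks interact, the following plain-Python sketch counts page views per event-time window and discards events that arrive too late. It is a simplified model, not the Spark Streaming API and not code from the course. The window size, the allowed lateness, and the function name `count_by_event_time` are assumptions made for this example, and lateness is judged by the event's own timestamp rather than by the end of its window.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

WINDOW_SECONDS = 60     # assumed tumbling-window size
ALLOWED_LATENESS = 30   # assumed watermark delay, in seconds

Event = Tuple[int, str]  # (event_time_seconds, page)


def count_by_event_time(events: Iterable[Event]) -> Tuple[Dict[int, int], List[Event]]:
    """Count events per event-time window, in arrival order.

    Events whose timestamp is older than the watermark are dropped.
    """
    counts: Dict[int, int] = defaultdict(int)
    dropped: List[Event] = []
    max_event_time = 0
    for event_time, page in events:
        # The watermark trails the largest event time seen so far.
        max_event_time = max(max_event_time, event_time)
        watermark = max_event_time - ALLOWED_LATENESS
        if event_time < watermark:
            dropped.append((event_time, page))
            continue
        # Assign the event to the window that contains its event time.
        window_start = event_time - event_time % WINDOW_SECONDS
        counts[window_start] += 1
    return dict(counts), dropped


# Arrival order differs from event-time order: 50 arrives after 62, 20 after 130.
arrivals = [(5, "/home"), (62, "/cart"), (50, "/home"), (130, "/home"), (20, "/cart")]
window_counts, too_late = count_by_event_time(arrivals)
```

In this run the event stamped 50 arrives out of order but is still within the allowed lateness, so it is counted in the window starting at 0. The event stamped 20 arrives after the watermark has advanced to 100 and is dropped.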

Style and Approach

This course promotes a practical approach to dealing with large amounts of online, unbounded data and drawing conclusions from it. You will implement streaming logic to handle huge volumes of unbounded streaming data.

Features
  • Understand the concepts and problems behind stream processing.
  • Get to know the Apache Spark Streaming API and create jobs that analyze data in near real time.
  • Analyze traffic on a web page in real time using Spark Streaming while users are browsing it.
Course Length 59 minutes
ISBN 9781788391528
Date Of Publication 28 Jun 2017

Authors

Tomasz Lelek

Tomasz Lelek is a software engineer who programs mostly in Java and Scala. He has worked with the core Java language for the past six years. He has developed multiple production Java software projects that work in a reactive way.

He is passionate about nearly everything associated with software development and believes that we should always try to consider different solutions and approaches before solving a problem. Recently, he spoke at conferences in Poland, at JDD (Java Developers Day), and at the Krakow Scala User Group. He has also conducted a live coding session at the GeeCON conference.

He is a co-founder of initLearn, an e-learning platform that was built with the Java language.

He has also written articles about everything related to the Java world.