Big Data Processing using Apache Spark [Video]

More Information
Learn
  • Understand the Spark API and its architecture.
  • Know the difference between RDD and DataFrame API.
  • Learn to join large amounts of data.
  • Start a project using Apache Spark.
  • Discover how to write efficient jobs using Apache Spark.
  • Test Spark code correctly.
  • Leverage Apache Spark to process big data faster.
About

Every year, the volume of data that we need to store and analyze grows significantly. When we aggregate all the data about our users and analyze it for insights, terabytes of data undergo processing. To process such volumes, we need a technology that can distribute computations and make them more efficient. Apache Spark is such a technology: it enables faster, scalable processing of big data.

In this course, we will learn how to leverage Apache Spark to process big data quickly. We will cover the basics of the Spark API and its architecture in detail. In the second section of the course, we will learn about data mining and data cleaning, looking at the input data structure and how input data is loaded. In the third section, we will write actual jobs that analyze data. By the end of the course, you will have a sound understanding of the Spark framework, which will help you write code and understand how big data is processed.

Style and Approach

Filled with hands-on examples, this course will help you learn how to process big data using Apache Spark.

Features
  • Explore the Apache Spark architecture and delve into its API and key features
  • Implement efficient big data processing using this framework
  • Write code that is maintainable and easy to test
Course Length 1 hour 24 minutes
ISBN 9781788398367
Date Of Publication 28 May 2017

Authors

Tomasz Lelek

Tomasz Lelek is a software engineer who programs mostly in Java and Scala. He has worked with the core Java language for the past six years. He has developed multiple production Java software projects that work in a reactive way. He is passionate about nearly everything associated with software development and believes that we should always try to consider different solutions and approaches before solving a problem. Recently, he was a speaker at conferences in Poland, at JDD (Java Developers Day), and at Krakow Scala User Group. He has also conducted a live coding session at Geecon Conference. He is a co-founder of initLearn, an e-learning platform that was built with the Java language. He has also written articles about everything related to the Java world.