Apache Spark 2 for Beginners [Video]

Rajanarayanan Thottuvaikkatumana

Take your first steps in developing large-scale distributed data processing applications using Apache Spark 2

Video Details

ISBN 13: 9781787281004
Course Length: 5 hours 38 minutes

Video Description

Spark is one of the most widely used engines for large-scale data processing, and it is known for its speed. Its tools are equally useful to application developers and data scientists.

This video starts with the fundamentals of Spark 2 and covers the core data processing framework and API, installation, and application development setup. The Spark programming model is then introduced through real-world examples, followed by Spark SQL programming with DataFrames. An introduction to SparkR comes next. Later, we cover the charting and plotting features of Python in conjunction with Spark data processing. After that, we take a look at Spark's stream processing, machine learning, and graph processing libraries. The last section combines all the skills you learned in the preceding sections to develop a real-world Spark application.

By the end of this video, you will be able to consolidate data processing, stream processing, machine learning, and graph processing into one unified and highly interoperable framework with a uniform API using Scala or Python.

Style and Approach

Learn about Spark's infrastructure with this practical tutorial. With the help of real-world use cases on the main features of Spark, we offer an easy introduction to the framework.

Table of Contents

Spark Fundamentals
  • The Course Overview
  • An Overview of Apache Hadoop
  • Understanding Apache Spark
  • Installing Spark on Your Machines

Spark Programming Model
  • Functional Programming with Spark and Understanding Spark RDD
  • Data Transformations and Actions with RDDs
  • Monitoring with Spark
  • The Basics of Programming with Spark
  • Creating RDDs from Files and Understanding the Spark Library Stack

Spark SQL
  • Understanding the Structure of Data and the Need of Spark SQL
  • Anatomy of Spark SQL
  • DataFrame Programming
  • Understanding Aggregations and Multi-Datasource Joining with SparkSQL
  • Introducing Datasets and Understanding Data Catalogs

Spark Programming with R
  • The Need for Spark and the Basics of the R Language
  • DataFrames in R and Spark
  • Spark DataFrame Programming with R
  • Understanding Aggregations and Multi-Datasource Joins in SparkR

Spark Data Analysis with Python
  • Charting and Plotting Libraries and Setting Up a Dataset
  • Charts, Plots, and Histograms
  • Bar Chart and Pie Chart
  • Scatter Plot and Line Graph

Spark Stream Processing
  • Data Stream Processing and Micro Batch Data Processing
  • A Log Event Processor
  • Windowed Data Processing and More Processing Options
  • Kafka Stream Processing
  • Spark Streaming Jobs in Production

Spark Machine Learning
  • Understanding Machine Learning and the Need of Spark for it
  • Wine Quality Prediction and Model Persistence
  • Wine Classification
  • Spam Filtering
  • Feature Algorithms and Finding Synonyms

Spark Graph Processing
  • Understanding Graphs with Their Usage
  • The Spark GraphX Library
  • Graph Processing and Graph Structure Processing
  • Tennis Tournament Analysis
  • Applying PageRank Algorithm
  • Connected Component Algorithm
  • Understanding GraphFrames and Its Queries

Designing Spark Applications
  • Lambda Architecture
  • Micro Blogging with Lambda Architecture
  • Implementing Lambda Architecture and Working with Spark Applications
  • Coding Style, Setting Up the Source Code, and Understanding Data Ingestion
  • Generating Purposed Views and Queries
  • Understanding Custom Data Processes

What You Will Learn

  • Get to know the fundamentals of Spark 2.0 and the Spark programming model using Scala and Python
  • Know how to use Spark SQL and DataFrames using Scala and Python
  • Get an introduction to Spark programming using R
  • Perform Spark data processing, charting, and plotting using Python
  • Get acquainted with Spark stream processing using Scala and Python
  • Be introduced to machine learning with Spark using Scala and Python
  • Get started with graph processing with Spark using Scala
  • Develop a complete Spark application
