PySpark and AWS: Master Big Data with PySpark and AWS [Video]

By AI Sciences

About this video

Python and Apache Spark are two of the most widely used tools in Big Data analytics, and PySpark, the Python API for Apache Spark, brings them together. In this course, you’ll start from the basics and progress to advanced data analysis. From cleaning data to building features and implementing machine learning (ML) models, you’ll learn how to execute end-to-end workflows using PySpark.

Throughout the course, you’ll use PySpark to perform data analysis. You’ll work with Spark RDDs and DataFrames, write some Spark SQL queries, and learn the transformations and actions that can be applied to data through RDDs and DataFrames. You’ll also explore the Spark and Hadoop ecosystems and their underlying architecture, and you’ll use the Databricks environment to run your Spark scripts.

Finally, you’ll get a taste of Spark on the AWS cloud. You’ll see how to use AWS storage, database, and compute services, and how Spark can communicate with different AWS services to retrieve the data it needs.

By the end of this course, you’ll be able to understand and implement the concepts of PySpark and AWS to solve real-world problems.

The code bundles are available here: https://github.com/PacktPublishing/PySpark-and-AWS-Master-Big-Data-with-PySpark-and-AWS

Publication date: September 2021
Publisher: Packt
Duration: 16 hours 10 minutes
ISBN: 9781803236698

About the Author
  • AI Sciences

    AI Sciences is a group of experts, PhDs, and practitioners in AI, ML, computer science, and statistics. Some of the experts work at large companies such as Amazon, Google, Facebook, Microsoft, KPMG, BCG, and IBM.

    They have produced a series of courses, aimed mainly at beginners and newcomers, on the techniques and methods of machine learning, statistics, artificial intelligence, and data science.

    Initially, their objective was to help those who wanted to understand these techniques more easily and get started without too much theory. Today, they also publish more complete courses for a wider audience. Their courses have helped more than 100,000 students master AI and data science.
