Hands-On Machine Learning with Scala and Spark [Video]

More Information
  • Extract features from data
  • Write Scala code implementing ML algorithms for prediction and clustering 
  • Analyze the structure of datasets with exploratory data analysis techniques using Scala
  • Get to grips with the most popular machine learning algorithms used in regression, classification, clustering, dimensionality reduction (including PCA), and neural networks
  • Use the power of MLlib libraries to implement machine learning with Spark
  • Use Gaussian Mixture Models (GMMs) to reason about time-series data
  • Work with the k-means and Naive Bayes algorithms and their methods and implement them in Scala with real datasets
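Since the course highlights implementing k-means in Scala, here is a minimal sketch of a single k-means iteration (assignment plus centroid update) on 2-D points in plain Scala. The object and function names are illustrative, not taken from the course code, and a real pipeline would use Spark MLlib's distributed implementation instead:

```scala
object KMeansSketch {
  type Point = (Double, Double)

  // Euclidean distance between two 2-D points
  def dist(a: Point, b: Point): Double =
    math.sqrt(math.pow(a._1 - b._1, 2) + math.pow(a._2 - b._2, 2))

  // One iteration: assign each point to its nearest centroid,
  // then recompute each centroid as the mean of its members.
  def step(points: Seq[Point], centroids: Seq[Point]): Seq[Point] = {
    val clusters = points.groupBy(p => centroids.minBy(c => dist(p, c)))
    centroids.map { c =>
      clusters.get(c) match {
        case Some(members) =>
          (members.map(_._1).sum / members.size,
           members.map(_._2).sum / members.size)
        case None => c // keep a centroid that attracted no points
      }
    }
  }

  def main(args: Array[String]): Unit = {
    val points = Seq((1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5))
    var centroids = Seq((0.0, 0.0), (10.0, 10.0))
    for (_ <- 1 to 5) centroids = step(points, centroids)
    println(centroids) // two centroids, one near each cluster
  }
}
```

Iterating `step` until the centroids stop moving is the whole algorithm; MLlib adds distributed execution and smarter initialization on top of this core loop.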

Programmers face multiple challenges when implementing ML; dealing with unstructured data and choosing the right ML model are among the hardest.

In this course, we will work through the day-to-day challenges programmers face when implementing ML pipelines, and consider different approaches and models for solving complex problems.

You will learn about the most effective machine learning techniques and apply them to your advantage. You will implement algorithms in practical, hands-on projects, building data models and understanding how they work by using different types of algorithms.

Each section of the course deals with a specific machine learning problem and analysis and gives you insights by using real-world datasets.

By the end of this course, you will be able to take huge datasets, extract features from them, and apply a machine learning model that is well suited to your problem.

The code bundle for the course is available at: https://github.com/PacktPublishing/Hands-On-Machine-Learning-with-Scala-and-Spark

Style and Approach

This is a step-by-step, fast-paced guide that will help you learn how to create an ML model using the Apache Spark ML toolkit. With this practical approach, you will take your skills to the next level and be able to create ML pipelines effectively.

  • Learn how to extract ML features from unstructured data for input to ML models so you can build models for any input data
  • Leverage Spark's powerful ML toolkit to build models by learning how to choose the best model for your problem
  • Use Deep Learning methods with Apache Spark to stay on the cutting edge of ML techniques
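Extracting ML features from unstructured text, as the first point above describes, usually means mapping tokens to a fixed-length numeric vector. The sketch below shows the hashing trick in plain Scala; Spark MLlib's `HashingTF` transformer is built on the same idea. The bucket count and the simple whitespace tokenizer are illustrative choices, not the course's actual code:

```scala
object TextFeaturesSketch {
  // Turn raw text into a fixed-length term-frequency vector by hashing
  // each token to one of numBuckets indices and counting occurrences.
  def featurize(text: String, numBuckets: Int = 16): Array[Double] = {
    val counts = new Array[Double](numBuckets)
    val tokens = text.toLowerCase.split("\\W+").filter(_.nonEmpty)
    for (t <- tokens) {
      val bucket = math.floorMod(t.hashCode, numBuckets) // stable token -> index mapping
      counts(bucket) += 1.0
    }
    counts
  }

  def main(args: Array[String]): Unit = {
    val vec = featurize("Spark makes machine learning at scale practical")
    println(vec.mkString(", ")) // 16 bucket counts summing to the token count
  }
}
```

Because the vector length is fixed regardless of vocabulary size, the same featurizer works for any input text, at the cost of occasional hash collisions between rare tokens.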
Course Length 1 hour 42 minutes
ISBN 9781789342468
Date Of Publication 30 Jan 2019


Tomasz Lelek

Tomasz Lelek is a software engineer, programming mostly in Java and Scala. He has been working with the Spark and ML APIs for the past 6 years, with production experience in processing petabytes of data. He is passionate about nearly everything associated with software development and believes that we should always try to consider different solutions and approaches before attempting to solve a problem. Recently, he has spoken at conferences in Poland, including Confitura and JDD (Java Developers Day), and at the Krakow Scala User Group. He has also conducted a live coding session at the Geecon Conference.