Reader small image

You're reading from  Artificial Intelligence for IoT Cookbook

Product typeBook
Published inMar 2021
Reading LevelIntermediate
PublisherPackt
ISBN-139781838981983
Edition1st Edition
Languages
Right arrow
Author (1)
Michael Roshak
Michael Roshak
author image
Michael Roshak

Michael Roshak is a cloud architect and strategist with extensive subject matter expertise in enterprise cloud transformation programs and infrastructure modernization through designing, and deploying cloud-oriented solutions and architectures. He is responsible for providing strategic advisory for cloud adoption, consultative technical sales, and driving broad cloud services consumption with highly strategic accounts across multiple industries.
Read more about Michael Roshak

Right arrow

Setting up Databricks

Processing large amounts of data is not possible on a single computer. That is where distributed systems such as Spark (made by Databricks) come in. Spark allows you to parallelize large workloads over many computers. 

Spark was developed to help solve the Netflix Prize, which had a $1 million prize for the team that made the best recommendation engine. Spark uses distributed computing to wrangle large and complex datasets. There are distributed Python equivalent libraries, such as Koalas, which is a distributed equivalent of pandas. Spark also supports analytics and feature engineering that requires a large amount of compute and memory, such as graph theory problems. Spark has two modes: a batch mode for training large datasets and a streaming mode for scoring data in near real time. 

IoT data tends to be large and imbalanced. A device may have 10 years of data showing it is running in normal conditions and only a few records showing it needs...

lock icon
The rest of the page is locked
Previous PageNext Page
You have been reading a chapter from
Artificial Intelligence for IoT Cookbook
Published in: Mar 2021Publisher: PacktISBN-13: 9781838981983

Author (1)

author image
Michael Roshak

Michael Roshak is a cloud architect and strategist with extensive subject matter expertise in enterprise cloud transformation programs and infrastructure modernization through designing, and deploying cloud-oriented solutions and architectures. He is responsible for providing strategic advisory for cloud adoption, consultative technical sales, and driving broad cloud services consumption with highly strategic accounts across multiple industries.
Read more about Michael Roshak