Reader small image

You're reading from  Artificial Intelligence with Python - Second Edition

Product typeBook
Published inJan 2020
Reading LevelBeginner
PublisherPackt
ISBN-139781839219535
Edition2nd Edition
Languages
Right arrow
Author (1)
Prateek Joshi
Prateek Joshi
author image
Prateek Joshi

Prateek Joshi is the founder of Plutoshift and a published author of 9 books on Artificial Intelligence. He has been featured on Forbes 30 Under 30, NBC, Bloomberg, CNBC, TechCrunch, and The Business Journals. He has been an invited speaker at conferences such as TEDx, Global Big Data Conference, Machine Learning Developers Conference, and Silicon Valley Deep Learning. Apart from Artificial Intelligence, some of the topics that excite him are number theory, cryptography, and quantum computing. His greater goal is to make Artificial Intelligence accessible to everyone so that it can impact billions of people around the world.
Read more about Prateek Joshi

Right arrow

Data preparation

The next step is a data transformation tier that processes the raw data; some of the transformations that need to be done are:

  • Data Cleansing
  • Filtration
  • Aggregation
  • Augmentation
  • Consolidation
  • Storage

The cloud providers have become the major data science platforms. Some of the most popular stacks are built around:

  • Azure ML service
  • AWS SageMaker
  • GCP Cloud ML Engine
  • SAS
  • RapidMiner
  • Knime

One of the most popular tools to perform these transformations is Apache Spark, but it still needs a data store. For persistence, the most common solutions are:

  • Hadoop Distributed File System (HDFS)
  • HBase
  • Apache Cassandra
  • Amazon S3
  • Azure Blob Storage

It's also possible to process data for machine learning in-place, inside the database; databases like SQL Server and SQL Azure are adding specific machine learning functionality to support machine learning pipelines. Spark has that...

lock icon
The rest of the page is locked
Previous PageNext Page
You have been reading a chapter from
Artificial Intelligence with Python - Second Edition
Published in: Jan 2020Publisher: PacktISBN-13: 9781839219535

Author (1)

author image
Prateek Joshi

Prateek Joshi is the founder of Plutoshift and a published author of 9 books on Artificial Intelligence. He has been featured on Forbes 30 Under 30, NBC, Bloomberg, CNBC, TechCrunch, and The Business Journals. He has been an invited speaker at conferences such as TEDx, Global Big Data Conference, Machine Learning Developers Conference, and Silicon Valley Deep Learning. Apart from Artificial Intelligence, some of the topics that excite him are number theory, cryptography, and quantum computing. His greater goal is to make Artificial Intelligence accessible to everyone so that it can impact billions of people around the world.
Read more about Prateek Joshi