
You're reading from Machine Learning Engineering on AWS

Product type: Book
Published in: Oct 2022
Publisher: Packt
ISBN-13: 9781803247595
Edition: 1st Edition
Author (1)
Joshua Arvin Lat

Joshua Arvin Lat is the Chief Technology Officer (CTO) of NuWorks Interactive Labs, Inc. He previously served as the CTO for three Australian-owned companies and as director of software development and engineering for multiple e-commerce start-ups. Years ago, he and his team won first place in a global cybersecurity competition with their published research paper. He is also an AWS Machine Learning Hero and has shared his knowledge at several international conferences, discussing practical strategies on machine learning, engineering, security, and management.


Getting started with data processing and analysis

In the previous chapter, we utilized a data warehouse and a data lake to store, manage, and query our data. Data stored in these data sources generally must undergo a series of data processing and data transformation steps similar to those shown in Figure 5.1 before it can be used as a training dataset for ML experiments:

Figure 5.1 – Data processing and analysis

In Figure 5.1, we can see that these data processing steps may involve merging different datasets, along with cleaning, converting, analyzing, and transforming the data using a variety of options and techniques. In practice, data scientists and ML engineers often spend many hours cleaning the data and preparing it for use in ML experiments. Some practitioners are used to writing and running custom Python or R scripts to perform this work. However, it may be more practical to use no-code or low-code solutions such as AWS Glue DataBrew...
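
To make the "custom Python script" approach mentioned above more concrete, here is a minimal sketch of the merge, clean, convert, and transform steps using only the Python standard library. The record layout, the field names (`booking_id`, `customer_id`, `amount`, `country`, `is_large`), the sample values, and the `prepare` helper are all hypothetical and chosen purely for illustration; they are not taken from the book's dataset.

```python
# Hypothetical sample records; field names and values are made up for illustration.
bookings = [
    {"booking_id": "1", "customer_id": "c1", "amount": " 120.50 "},
    {"booking_id": "2", "customer_id": "c2", "amount": ""},      # missing amount
    {"booking_id": "3", "customer_id": "c1", "amount": "80"},
    {"booking_id": "3", "customer_id": "c1", "amount": "80"},    # duplicate row
]
customers = [
    {"customer_id": "c1", "country": "PH"},
    {"customer_id": "c2", "country": "AU"},
]


def prepare(bookings, customers):
    """Merge, clean, convert, and transform raw records into training rows."""
    # Merge: index customers by ID so each booking can look up its customer.
    customers_by_id = {c["customer_id"]: c for c in customers}

    seen_ids = set()
    rows = []
    for booking in bookings:
        # Clean: skip duplicate bookings and rows with a missing amount.
        if booking["booking_id"] in seen_ids:
            continue
        seen_ids.add(booking["booking_id"])
        raw_amount = booking["amount"].strip()
        if not raw_amount:
            continue

        # Convert: turn string fields into numeric types.
        amount = float(raw_amount)
        customer = customers_by_id.get(booking["customer_id"], {})

        # Transform: derive a simple binary feature from the amount.
        rows.append({
            "booking_id": int(booking["booking_id"]),
            "amount": amount,
            "country": customer.get("country", "unknown"),
            "is_large": int(amount >= 100),
        })
    return rows


training_rows = prepare(bookings, customers)
for row in training_rows:
    print(row)
```

Even for this tiny example, the script has to handle duplicates, missing values, and type conversion by hand, which is the kind of repetitive work that no-code or low-code tools aim to reduce.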
