You're reading from Automated Machine Learning

Product type: Book
Published in: Feb 2021
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781800567689
Edition: 1st Edition
Languages: Python
Tools: Azure Functions
Concepts: Machine Learning
Author: Adnan Masood

How automated ML works

ML techniques excel at finding patterns in large datasets. Today, we use these techniques for anomaly detection, customer segmentation, customer churn analysis, demand forecasting, predictive maintenance, and pricing optimization, among hundreds of other use cases.

A typical ML life cycle comprises data collection, data wrangling, pipeline management, model retraining, and model deployment. Of these, data wrangling is typically the most time-consuming task.

Extracting meaningful features out of data, and then using them to build a model while finding the right algorithm and tuning the parameters, is also a very time-consuming process. Can we automate this process using the very thing we are trying to build here (meta enough?); that is, should we automate ML? Well, that is how this all started – with someone attempting to print a 3D printer using a 3D printer.

A typical data science workflow starts with a business problem (hopefully!), and it is used either to prove a hypothesis or to discover new patterns in the existing data. It requires data, and that data must be cleaned and preprocessed, which takes an awfully large amount of time – as much as 80% of your total time. This "data munging", or wrangling, includes cleaning, de-duplication, outlier analysis and removal, transforming, mapping, structuring, and enriching. Essentially, we're taming unwieldy, raw, real-world data and putting it into the desired format for analysis and modeling so that we can gain meaningful insights from it.
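As a rough illustration of the wrangling steps listed above, the following minimal sketch de-duplicates records, drops missing values, and removes outliers. The sample records, the `wrangle` helper, and the median-absolute-deviation threshold of 5 are all assumptions made for this example; real pipelines usually need far more domain-specific rules.

```python
from statistics import median

# Hypothetical raw records: (customer_id, monthly_spend); None marks a missing value.
raw_rows = [
    ("c1", 120.0),
    ("c2", 95.0),
    ("c2", 95.0),    # exact duplicate
    ("c3", None),    # missing value
    ("c4", 110.0),
    ("c5", 105.0),
    ("c6", 9000.0),  # suspiciously large, likely a data-entry error
]


def wrangle(rows, max_deviations=5.0):
    """De-duplicate, drop missing values, and filter outliers.

    Outliers are rows whose value lies more than `max_deviations`
    median absolute deviations (MAD) away from the median.
    """
    # dict.fromkeys keeps the first occurrence of each row, in order.
    deduped = list(dict.fromkeys(rows))
    complete = [(key, value) for key, value in deduped if value is not None]

    values = [value for _, value in complete]
    center = median(values)
    mad = median(abs(value - center) for value in values)
    if mad == 0:
        # All values are (nearly) identical, so nothing can be flagged.
        return complete
    return [
        (key, value)
        for key, value in complete
        if abs(value - center) / mad <= max_deviations
    ]


clean_rows = wrangle(raw_rows)
print(clean_rows)
```

The median-based rule is used here only because a single extreme value barely shifts the median, whereas it would distort a mean-based rule on such a small sample.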

Next, we must select and engineer features, which means figuring out which features are useful, and then brainstorming and working with subject matter experts (SMEs) on the importance and validity of these features. Validating how these features would work with your model, checking their fitness from both a technical and a business perspective, and improving them as needed are also critical parts of the feature engineering process. The feedback loop to the SME is often very important, albeit the least emphasized part of the feature engineering pipeline. The transparency of models stems from clear features – if features such as race or gender give higher accuracy in your loan repayment propensity model, this does not mean it's a good idea to use them. In fact, an SME would tell you – if your conscience hasn't – that it's a terrible idea and that you should look for more meaningful and less sexist, racist, and xenophobic features. We will discuss this further in Chapter 10, AutoML in the Enterprise, when we discuss operationalization.

Even though the task of "selecting a model family" sounds like a reality show, that is what data scientists and ML engineers do as part of their day-to-day job. Model selection is the task of picking the model that best describes the data at hand. This involves selecting an ML model from a set of candidate models. Automated ML can give you a helping hand with this.
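To make "selecting an ML model from a set of candidate models" concrete, here is a minimal sketch that fits two candidates on training data and keeps the one with the lower error on held-out validation data. The two candidates (a constant predictor and a least-squares line), the synthetic data, and the `select_model` helper are assumptions chosen for brevity; automated ML tools search over much richer model families.

```python
from statistics import mean


def fit_constant(xs, ys):
    """Candidate 1: always predict the mean of the training targets."""
    average = mean(ys)
    return lambda x: average


def fit_linear(xs, ys):
    """Candidate 2: a straight line fitted by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return lambda x: slope * x + intercept


def mean_squared_error(model, xs, ys):
    return mean((model(x) - y) ** 2 for x, y in zip(xs, ys))


def select_model(candidates, train, valid):
    """Fit every candidate on `train` and score it on `valid`."""
    scores = {}
    for name, fit in candidates.items():
        model = fit(*train)
        scores[name] = mean_squared_error(model, *valid)
    best_name = min(scores, key=scores.get)
    return best_name, scores


# Synthetic data: roughly y = 2x + 1 with a small alternating disturbance.
xs = list(range(10))
ys = [2 * x + 1 + (0.1 if x % 2 == 0 else -0.1) for x in xs]
train = (xs[::2], ys[::2])    # even positions for training
valid = (xs[1::2], ys[1::2])  # odd positions for validation

candidates = {"constant": fit_constant, "linear": fit_linear}
best_name, scores = select_model(candidates, train, valid)
print(best_name, scores)
```

Scoring on data that was not used for fitting is the key design choice here: a candidate that merely memorizes the training set would otherwise look better than it is.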

Hyperparameters

You will hear about hyperparameters a lot, so let's make sure you understand what they are.

Each model has its own internal and external parameters. Internal parameters (also known as model parameters, or just parameters) are the ones intrinsic to the model and learned from the data, such as its weights and predictor matrix, while external parameters, or hyperparameters, are set "outside" the model itself before training, such as the learning rate and the number of iterations. An intuitive example can be derived from k-means, a well-understood unsupervised clustering algorithm known for its simplicity.

The k in k-means stands for the number of clusters required, and epochs (pronounced epics, as in Doctor Who is an epic show!) are used to specify the number of passes that are done over the training data. Both of these are examples of hyperparameters – that is, parameters that are not intrinsic to the model itself. Similarly, the learning rate for training a neural network, C and sigma for support vector machines, the number of leaves or the depth of a tree, the latent factors in a matrix factorization, and the number of hidden layers in a deep neural network are all examples of hyperparameters.
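The distinction can be seen in a small, hedged sketch of k-means on one-dimensional data. The function name `k_means_1d`, the naive initialization, and the sample points are assumptions for illustration only. Note which values the caller supplies (`k` and `n_iterations`, the hyperparameters) and which values the algorithm learns (the centroids, the model parameters).

```python
from statistics import mean


def k_means_1d(points, k, n_iterations):
    """Cluster 1-D points and return the learned centroids.

    k            -- hyperparameter: number of clusters
    n_iterations -- hyperparameter: passes over the data (the epochs above)
    """
    # Naive initialization (an assumption of this sketch): the k smallest points.
    centroids = sorted(points)[:k]
    for _ in range(n_iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda i: abs(point - centroids[i]))
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            mean(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids


points = [1.0, 1.2, 0.8, 10.0, 10.2, 9.8]

# Same data, same k, different number of passes.
centroids_after_one = k_means_1d(points, k=2, n_iterations=1)
centroids_converged = k_means_1d(points, k=2, n_iterations=10)
print(centroids_after_one, centroids_converged)
```

With this poor initialization, a single pass leaves one centroid stranded between the two groups, while a few more passes settle near 1 and 10. The data did not change; only a hyperparameter did.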

Selecting the right hyperparameters has been called tuning your instrument, which is where the magic happens. In ML tribal folklore, these elusive numbers have been branded as "nuisance parameters", to the point where proverbial statements such as "tuning is more of an art than a science" and "tuning models is like black magic" tend to discourage newcomers to the industry. Automated ML is here to change this perception by helping you choose the right hyperparameters; more on this later. Automated ML enables citizen data scientists to build, train, and deploy ML models, thus possibly disrupting the status quo.
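One simple way to take the "black magic" out of tuning is an exhaustive grid search: try every combination of candidate hyperparameter values and keep the one with the best validation score. The sketch below is an assumption-laden toy (a one-weight model trained by gradient descent, a hand-picked grid, and hypothetical helper names); it is not how any particular automated ML product works, and such tools typically use more efficient search strategies.

```python
from itertools import product


def train(xs, ys, learning_rate, n_epochs):
    """Fit y ~ w * x by gradient descent; w is the learned model parameter."""
    w = 0.0
    n = len(xs)
    for _ in range(n_epochs):
        # Gradient of the mean squared error with respect to w.
        gradient = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * gradient
    return w


def loss(w, xs, ys):
    """Mean squared error of the model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def grid_search(train_data, valid_data, grid):
    """Evaluate every hyperparameter combination and return the best one."""
    best = None
    for learning_rate, n_epochs in product(grid["learning_rate"], grid["n_epochs"]):
        w = train(*train_data, learning_rate, n_epochs)
        score = loss(w, *valid_data)
        if best is None or score < best["score"]:
            best = {
                "learning_rate": learning_rate,
                "n_epochs": n_epochs,
                "w": w,
                "score": score,
            }
    return best


# Toy data generated from y = 2x, split into training and validation parts.
train_data = ([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
valid_data = ([4.0, 5.0], [8.0, 10.0])

# Hypothetical search grid; 0.5 is deliberately too large and diverges.
grid = {"learning_rate": [0.001, 0.01, 0.1, 0.5], "n_epochs": [10, 100]}

best = grid_search(train_data, valid_data, grid)
print(best)
```

Grid search is easy to reason about but its cost grows multiplicatively with every hyperparameter added to the grid, which is one reason automated approaches are attractive.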

Important note

Some consider the term "citizen data scientists" a euphemism for non-experts, but SMEs and people who are curious about analytics are some of the most important people – and don't let anyone tell you otherwise.

In conclusion, from building the correct ensembles of models to preprocessing the data, selecting the right features and model family, choosing and optimizing model hyperparameters, and evaluating the results, automated ML offers algorithmic solutions that can programmatically address these challenges.

The need for automated ML

At the time of writing, OpenAI's GPT-3 model has recently been announced, and it has an incredible 175 billion parameters. Due to this ever-increasing model complexity, which includes big data and an exponentially increasing number of features, we now need not only to be able to tune these parameters, but also to have sophisticated, repeatable procedures in place for adjusting these proverbial knobs. This complexity makes the field less accessible to citizen data scientists, business subject matter experts, and domain experts – which might sound like job security, but it is not good for business, nor for the long-term success of the field.

Also, this isn't just about the hyperparameters. Managing the entire pipeline and reproducing the results both become harder as the model's complexity grows, which curtails AI democratization.

Adnan Masood, PhD is an artificial intelligence and machine learning researcher, visiting scholar at Stanford AI Lab, software engineer, Microsoft MVP (Most Valuable Professional), and Microsoft's regional director for artificial intelligence. As chief architect of AI and machine learning at UST Global, he collaborates with Stanford AI Lab and MIT CSAIL, and leads a team of data scientists and engineers building artificial intelligence solutions to produce business value and insights that affect a range of businesses, products, and initiatives.