You're reading from The AI Product Manager's Handbook

Product type: Book

Published in: Feb 2023

Reading Level: Intermediate

Publisher: Packt

ISBN-13: 9781804612934

Edition: 1st Edition

Languages

Python

Concepts

Artificial Intelligence

Author (1)

Irene Bratsis

Learning types in ML

In this section, we will cover the differences between supervised, unsupervised, semi-supervised, and reinforcement learning and how each of these learning types can be applied. Again, the learning type comes down to whether or not you’re labeling the data and how you reward the model for good performance. The ultimate objective is to understand which kind of learning gets you the performance and explainability you’re going to need when deciding whether or not to use it in your product.

Supervised learning

If humans are labeling the data and the machine’s job is to correctly label current or future data points, it’s supervised learning. Because we humans know the answer the machine is trying to arrive at, we can measure how far off it is from the correct answer, and we keep training and retraining the model until it reaches a level of accuracy that we’re happy with.

Applications of supervised learning models include classification models, which categorize data the way spam filters do, and regression models, which look for relationships between variables in order to predict future events and find trends. Keep in mind that all models will only work to a certain point, which is why they require constant training and updating. It’s also why AI teams often use ensemble modeling or try various models and choose the best-performing one. It won’t be perfect either way, but with enough hand-holding, it will take you closer and closer to the truth.
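To make the regression case concrete, here is a minimal sketch of fitting a line to labeled data and measuring how far off the predictions are. The data, the variable names, and the choice of mean absolute error as the measure are all assumptions made for illustration; they are not taken from the book.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def mean_absolute_error(xs, ys, slope, intercept):
    """Average distance between the known labels and the line's predictions."""
    return mean(abs(y - (slope * x + intercept)) for x, y in zip(xs, ys))


# Hypothetical labeled data: ad spend (x) and the sign-ups it produced (y).
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
signups = [3.1, 4.9, 7.2, 8.8, 11.0]

slope, intercept = fit_line(spend, signups)
error = mean_absolute_error(spend, signups, slope, intercept)
forecast = slope * 6.0 + intercept  # prediction for a spend level not seen yet
```

Because the labels are known, `error` gives a direct measure of how far off the model is, which is what makes the train, measure, and retrain loop possible.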

The following is a list of common supervised learning models/algorithms you’ll likely use in production for various products:

Naive Bayes classifier: This algorithm naively treats every feature in your dataset as independent of the others once the class is known. That independence assumption rarely holds exactly, but it keeps the probability calculations simple. It’s one of the simpler algorithms out there, and its simplicity is actually what makes it so successful with classification. It’s commonly used for binary decisions such as determining whether or not something is spam.
Support vector machine (SVM): This algorithm is also largely used for classification problems. It looks for the boundary that best splits your dataset into two classes so that you can predict which side of that boundary future data points will land on. If the classes don’t separate cleanly in the original features, SVMs can map the data into more dimensions (using kernels), where the groupings are often easier to separate.
Linear regression models: The least-squares method behind these models dates back to the early 1800s, and they’re the simplest models we have for regression problems such as predicting future data points. They use one or more independent variables in your dataset to predict your dependent variable. The linear part of this model refers to finding the line that best fits your data, and this line is what dictates how it predicts. Here, we once again see a relatively simple model being heavily used because of how versatile and dependable it is.
Logistic regression: This model works a lot like linear regression in that you have independent and dependent variables, but instead of predicting a numerical value, it estimates the probability of a binary categorical state, such as whether or not someone will default on a loan in the future.
Decision trees: This algorithm works well for predicting both categorical and numerical values, so it’s used for both kinds of ML problems, such as predicting a future state or a future price. Relatively few algorithms handle both well, which has contributed to the popularity of decision trees. The comparison to a tree comes from the nodes and branches, which effectively function like a flow chart. The model learns from past data which question to ask at each node in order to predict future values.
Random forest: This algorithm builds on decision trees and is also used for both categorical and numerical problems. It splits the data into different random samples, creates a decision tree for each sample, and then takes a majority vote (for categorical predictions) or an average (for numerical predictions) across the trees. It’s hard to understand how a random forest comes to its conclusions, so it’s a better fit when interpretability isn’t high on your list of concerns.
K-nearest neighbors (KNN): This algorithm works on both categorical and numerical predictions. To predict a new data point, it finds the k closest labeled data points (its neighbors) and gives its best guess based on them: a majority vote for a category or an average for a number. The number of neighbors, k, is set by the engineer/data scientist.
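As an illustration of the last item in the list, K-nearest neighbors is simple enough to sketch in a few lines. This is a toy version under stated assumptions: the two features, the labels, and the function name `knn_predict` are hypothetical, and a production system would normally rely on a tested library rather than hand-written code.

```python
from collections import Counter
from math import dist


def knn_predict(train_points, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest labeled points."""
    # Rank the labeled points by Euclidean distance to the query point.
    ranked = sorted(
        zip(train_points, train_labels),
        key=lambda pair: dist(pair[0], query),
    )
    # Count the labels of the k closest points and return the most common one.
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical labeled emails: (number of links, number of exclamation marks).
emails = [(0, 1), (1, 0), (1, 2), (7, 9), (8, 6), (9, 8)]
labels = ["ham", "ham", "ham", "spam", "spam", "spam"]

busy_email = knn_predict(emails, labels, (8, 7))
plain_email = knn_predict(emails, labels, (1, 1))
```

For a numerical prediction, the same neighbors would be averaged instead of counted. The majority vote here is also the idea a random forest applies across its trees.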

Now that we’ve covered supervised learning, let’s discuss unsupervised learning next.

Unsupervised learning

If the data is unlabeled and we’re using machines to label the data and find patterns we don’t yet know of, it’s unsupervised learning. Effectively, we humans either know the right answer or we don’t, and that’s how we decide which camp an ML algorithm belongs to. As you might imagine, we take the results of unsupervised learning models with some hesitancy because they may find an organization that isn’t actually helpful or accurate. Unsupervised learning models also require large amounts of data to train on because the results can be wildly inaccurate when a model tries to find patterns in a small data sample. As a model ingests more and more data, its results generally become more refined, but once again, there is no correct answer to check them against.

Applications of unsupervised learning models include clustering and dimensionality reduction. Clustering models segment or group data into certain areas. These can be used for things such as looking for patterns in medical trials or drug discovery, for instance, because you’re looking for connections and groups of data where there might not already be obvious answers. Dimensionality reduction removes or combines the features in your dataset that contribute least to the performance you’re looking for. This simplifies your data so that your most important features stand out and real signals are easier to separate from the noise.

The following is a list of common unsupervised learning models/algorithms you’ll likely use in production for various products:

K-means clustering: This algorithm groups data points into k clusters, placing each point in the cluster whose center is nearest, so that patterns are easier to see. This is unsupervised learning, so the model looks for patterns on its own because it’s not given any labels (or supervision) by the engineer that’s using it. The number of clusters is a hyperparameter: the algorithm doesn’t find it for you, so you will need to choose it, typically by comparing the results for several values.
Principal component analysis (PCA): Often, the largest problem with using unsupervised ML on very large datasets is that there are too many features, many of them redundant, to find meaningful patterns. This is why PCA is used so often: it reduces the number of dimensions while retaining as much of the variation in the data as possible, so little information is lost. This is especially useful for massive datasets, such as when finding patterns in genome sequencing or drug discovery trials.
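The k-means item in the list above can be sketched as a loop that alternates between assigning points to the nearest center and moving each center to the mean of its members. This is a minimal illustration, not a production implementation: the sample points are made up, and starting from the first k points is an assumption chosen only to keep the example reproducible (real libraries typically use randomized starting points).

```python
from math import dist


def k_means(points, k, iterations=10):
    """Plain k-means on tuples of numbers; returns (centroids, labels)."""
    # Assumption for reproducibility: start from the first k points.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if not members:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
                continue
            new_centroids.append(
                tuple(sum(axis) / len(members) for axis in zip(*members))
            )
        if new_centroids == centroids:
            break  # nothing moved, so the clusters have stabilized
        centroids = new_centroids
    return centroids, labels


# Hypothetical unlabeled data with two visible groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, labels = k_means(points, k=2)
```

Note that `k` is passed in by the caller. Running the function with several values of `k` and comparing how tight the resulting clusters are is one common way to choose it.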

Next, let’s jump into semi-supervised learning.

Semi-supervised learning

In a perfect world, we’d have massive, well-labeled datasets with which to create optimal models that don’t overfit. Overfitting is when you tune a model so closely to the dataset you have that it’s optimized for that particular dataset and doesn’t work well with more diverse data. This is a common problem in data science. We live in an imperfect world, and we can find ourselves in situations where we don’t have enough labeled data or enough data at all. This is where semi-supervised learning comes in handy. We give the model some labeled data along with data that is unlabeled, so the labels nudge the model in the right direction as it tries to find patterns on its own.

It doesn’t quite have the same level of absolute truth associated with supervised learning, but it does offer the model some helpful clues with which to organize its results so that it can find an easier path to the right answer.

For instance, let’s say you’re looking for a model that works well at detecting patterns in photos or speech. You might label a few examples and then see how the performance improves over time as the model works through the examples you didn’t label. You can use multiple models in semi-supervised learning. The process is a lot like supervised learning, in which the model learns from labeled datasets so that it knows exactly how far off it is from being correct. The main difference is that, in semi-supervised learning, the model also predicts labels for a portion of the unlabeled data, and you check its accuracy against the labeled data. The predictions the model is most confident in are then added to the training set, so it keeps training on the data it appears to have gotten correct.
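One common way to implement the loop described above is self-training (also called pseudo-labeling). The sketch below is a toy version under several assumptions: a nearest-centroid classifier stands in for a real model, the confidence score and the `threshold` value are invented for illustration, and the data is made up.

```python
from math import dist


def centroids_by_class(points, labels):
    """Mean point per class: a deliberately simple stand-in for a real model."""
    model = {}
    for cls in sorted(set(labels)):
        members = [p for p, label in zip(points, labels) if label == cls]
        model[cls] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return model


def predict_with_confidence(model, point):
    """Return (label, confidence); confidence nears 1 when one class is much closer."""
    ranked = sorted(model, key=lambda cls: dist(point, model[cls]))
    d_near = dist(point, model[ranked[0]])
    d_next = dist(point, model[ranked[1]])
    confidence = 1 - d_near / d_next if d_next else 0.0
    return ranked[0], confidence


def self_train(points, labels, unlabeled, threshold=0.6, rounds=5):
    """Repeatedly move confidently predicted points into the training set."""
    points, labels, pool = list(points), list(labels), list(unlabeled)
    for _ in range(rounds):
        model = centroids_by_class(points, labels)
        still_unlabeled, added = [], 0
        for p in pool:
            label, confidence = predict_with_confidence(model, p)
            if confidence >= threshold:
                points.append(p)      # train on this point from now on
                labels.append(label)  # using the model's own predicted label
                added += 1
            else:
                still_unlabeled.append(p)
        pool = still_unlabeled
        if added == 0:
            break  # nothing confident enough is left to add
    return centroids_by_class(points, labels), pool


# Hypothetical data: two labeled examples and five unlabeled ones.
model, leftover = self_train(
    [(1, 1), (9, 9)], ["cat", "dog"],
    [(2, 1), (8, 9), (3, 3), (7, 6), (5, 5)],
)
```

Points the model is unsure about stay in `leftover` instead of being forced into a class. This matters because a wrong pseudo-label would be trained on as if it were true.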

Finally, to wrap up this section, let’s take a brief look at reinforcement learning.

Reinforcement learning

This area of ML effectively learns through trial and error: the model learns from the outcomes of its past behavior and adapts its approach by itself to find the best performance. Reinforcement learning is sequential, and it’s really a system based on weights and rewards that reinforce correct results. The model tries to optimize for these rewards and gets better with time. We see reinforcement learning used a lot in robotics, for instance, where robots are trained to understand how to operate and adjust to the parameters of the real world with all its unpredictability.
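A small way to see trial and error with rewards in action is the multi-armed bandit, sketched below with an epsilon-greedy strategy. Everything specific here is an assumption for illustration: the three payout probabilities, the 10% exploration rate, and the function name `run_bandit`. Real robotics problems involve far richer states and actions.

```python
import random


def run_bandit(payout_probs, steps=3000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a multi-armed bandit; returns (estimates, pulls)."""
    rng = random.Random(seed)  # seeded so repeated runs behave the same way
    n_arms = len(payout_probs)
    estimates = [0.0] * n_arms  # the agent's learned value for each action
    pulls = [0] * n_arms
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: try a random action
        else:
            # Exploit: pick the action that currently looks best.
            arm = max(range(n_arms), key=lambda a: estimates[a])
        # The environment pays a reward of 1 with the arm's hidden probability.
        reward = 1.0 if rng.random() < payout_probs[arm] else 0.0
        pulls[arm] += 1
        # Incremental average: nudge the estimate toward the observed reward.
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
    return estimates, pulls


# Hypothetical environment: three actions with hidden success rates.
estimates, pulls = run_bandit([0.2, 0.5, 0.9])
best_arm = max(range(len(estimates)), key=lambda a: estimates[a])
```

The agent is never told which action is correct. It only sees rewards, and its estimates drift toward the action that pays off most often.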

Now that we’ve discussed and understood the different ML types, let’s move on and understand the optimal flow of the ML process.

Author (1)

Irene Bratsis

Irene Bratsis is a director of digital product and data at the International WELL Building Institute (IWBI). She has a bachelor's in economics, and after completing various MOOCs in data science and big data analytics, she completed a data science program with Thinkful. Before joining IWBI, Irene worked as an operations analyst at Tesla, a data scientist at Gesture, a data product manager at Beekin, and head of product at Tenacity. Irene volunteers as NYC chapter co-lead for Women in Data, has coordinated various AI accelerators, moderated countless events with a speaker series with Women in AI called WaiTalk, and runs a monthly book club focused on data and AI books.