You're reading from  The AI Product Manager's Handbook

Product type: Book
Published in: Feb 2023
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781804612934
Edition: 1st Edition
Author (1)
Irene Bratsis

Irene Bratsis is a director of digital product and data at the International WELL Building Institute (IWBI). She has a bachelor's in economics, and after completing various MOOCs in data science and big data analytics, she completed a data science program with Thinkful. Before joining IWBI, Irene worked as an operations analyst at Tesla, a data scientist at Gesture, a data product manager at Beekin, and head of product at Tenacity. Irene volunteers as NYC chapter co-lead for Women in Data, has coordinated various AI accelerators, moderated countless events with a speaker series with Women in AI called WaiTalk, and runs a monthly book club focused on data and AI books.

Learning types in ML

In this section, we will cover the differences between supervised, unsupervised, semi-supervised, and reinforcement learning and how all these learning types can be applied. Again, the learning type has to do with whether or not you’re labeling the data and the method you’re using to reward the models you’ve used for good performance. The ultimate objective is to understand what kind of learning model gets you the kind of performance and explainability you’re going to need when considering whether or not to use it in your product.

Supervised learning

If humans are labeling the data and the machine is looking to also correctly label current or future data points, it’s supervised learning. Because we humans know the answer the machines are trying to arrive at, we can see how off they are from finding the correct answer, and we continue this process of training the models and retraining them until we find a level of accuracy that we’re happy with.

Applications of supervised learning include classification models, which categorize data the way spam filters do, and regression models, which look for relationships between variables in order to predict future values and find trends. Keep in mind that every model only works up to a point, which is why models require ongoing retraining and updating. It is also why AI teams often use ensemble modeling or try several models and choose the best-performing one. The result won’t be perfect either way, but with enough hand-holding, it will take you closer and closer to the truth.
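To make the regression case concrete, here is a minimal sketch of fitting a straight line to labeled data with ordinary least squares and then measuring how far off the predictions are. The data (`ad_spend`, `signups`) and the function names are made up for illustration and are not from the book. A production team would more likely rely on an ML library than on hand-written code like this.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Predict y for a new x using the fitted line."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical labeled data: ad spend (x) and the signups humans recorded (y).
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
signups = [12.0, 19.0, 31.0, 42.0, 48.0]

model = fit_line(ad_spend, signups)

# Because the labels are known, we can measure how far off the model is.
mean_abs_error = sum(
    abs(predict(model, x) - y) for x, y in zip(ad_spend, signups)
) / len(ad_spend)

print("slope, intercept:", model)
print("mean absolute error:", mean_abs_error)
print("prediction for x = 6:", predict(model, 6.0))
```

The error measurement is the part that makes this supervised: the known labels give the team a number to drive down through retraining or by trying a different model.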

The following is a list of common supervised learning models/algorithms you’ll likely use in production for various products:

  • Naive Bayes classifier: This algorithm naively treats every feature in your dataset as independent of the others, given the class. That independence assumption rarely holds exactly, but it lets the model estimate class probabilities from simple per-feature statistics. It’s one of the simpler algorithms out there, and its simplicity is part of what makes it so successful with classification. It’s commonly used for binary classification, such as deciding whether or not something is spam.
  • Support vector machine (SVM): This algorithm is also largely used for classification problems. It tries to find the boundary that best separates your data into two classes, so that you can predict which side of that boundary future data points will land on. If the classes aren’t cleanly separable using the original features, SVMs can use kernels to map the data into more dimensions, where the groupings are easier to separate.
  • Linear regression models: These have been around since the early 1800s, when the method of least squares was first published, and they’re the simplest models we have for regression problems such as predicting future data points. They essentially use one or more variables in your dataset to predict your dependent variable. The linear part of this model is trying to find the best line to fit your data, and this line is what dictates how it predicts. Here, we once again see a relatively simple model also being heavily used because of how versatile and dependable it is.
  • Logistic regression: This model works a lot like linear regression in that you have independent and dependent variables, but it’s not predicting a numerical value; it’s predicting a future binary categorical state such as whether or not someone might default on a loan in the future, for instance.
  • Decision trees: This algorithm works well for predicting both categorical and numerical values, so it’s used for both kinds of ML problems, such as predicting a future state or a future price. Few simple algorithms handle both kinds of problems, which has contributed to the popularity of decision trees. The comparison to a tree comes from the nodes and branches, which effectively function like a flow chart. The model learns a sequence of splits from past data and follows them to predict future values.
  • Random forest: This algorithm builds on the previous decision trees and is also used for both categorical and numerical problems. It splits the data into different random samples, creates a decision tree for each sample, and then takes a majority vote or an average for its predictions (depending on whether you’re using it for categorical or numerical predictions). It’s harder to understand how a random forest comes to its conclusions than it is with a single tree, so it’s a better fit when interpretability isn’t high on your list of concerns.
  • K-nearest neighbors (KNN): This algorithm works on categorical as well as numerical predictions. The number of neighbors, k, is set by the engineer or data scientist. For a new data point, the model finds the k most similar points in the existing data and gives its best guess based on those neighbors: a majority vote for a category, or an average for a numerical value.
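As one illustration of the models listed above, the following is a minimal sketch of a k-nearest neighbors classifier. The two-feature data points, the "ham"/"spam" labels, and the function name `knn_predict` are assumptions made up for this example. It uses plain Euclidean distance and a majority vote. For a numerical prediction, you would average the neighbors' values instead.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest labeled points."""
    # Pair each training point's distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    # The most common label among the k neighbors is the prediction.
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical labeled examples with two numeric features each.
train_points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
train_labels = ["ham", "ham", "ham", "spam", "spam", "spam"]

print(knn_predict(train_points, train_labels, (2, 2)))
print(knn_predict(train_points, train_labels, (6, 5)))
```

The choice of `k` is the knob the text mentions: a small `k` reacts to individual neighbors, while a larger `k` smooths the vote across more of the data.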

Now that we’ve covered supervised learning, let’s discuss unsupervised learning next.

Unsupervised learning

If the data is unlabeled and we’re using machines to find patterns and groupings we don’t yet know of, it’s unsupervised learning. Effectively, we humans either know the right answer or we don’t, and that’s how we determine which camp an ML algorithm belongs to. As you might imagine, we take the results of unsupervised learning models with some hesitancy because they may find an organization that isn’t actually helpful or accurate. Unsupervised learning models also require large amounts of data to train on because the results can be wildly inaccurate if a model is trying to find patterns in a small data sample. As a model ingests more data, its results tend to become more refined, but once again, there is no correct answer to check them against.

Applications of unsupervised learning models include clustering and dimensionality reduction. Clustering models segment or group data into certain areas. These can be used for things such as looking for patterns in medical trials or drug discovery, for instance, because you’re looking for connections and groups of data where there might not already be obvious answers. Dimensionality reduction simplifies your dataset by removing or combining the features that contribute least to the performance you’re looking for, so that the most important features remain and it’s easier to separate real signals from the noise.

The following is a list of common unsupervised learning models/algorithms you’ll likely use in production for various products:

  • K-means clustering: This algorithm groups data points into a set number of clusters so that patterns are easier to see. This is unsupervised learning, so the model isn’t given any information (or supervision) by the engineer using it and has to find patterns on its own. The number of clusters, k, is a hyperparameter: the algorithm won’t find it for you, so you will need to choose what number of clusters is optimal.
  • Principal component analysis (PCA): Often, the largest problem with using unsupervised ML on very large datasets is that there are too many features, many of them correlated or uninformative, to find meaningful patterns. This is why PCA is used so often: it reduces the number of dimensions by combining features into a smaller set of components while retaining as much of the variation in the data as possible. This is especially useful for massive datasets such as finding patterns in genome sequencing or drug discovery trials.
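The k-means idea above can be sketched in a few lines. This is a simplified illustration, not a production implementation: the data points are made up, the function name `kmeans` is hypothetical, and the starting centroids are simply the first k points. Real libraries use smarter initialization and convergence checks.

```python
import math


def kmeans(points, k, iterations=10):
    """Group points into k clusters by alternating assignment and update steps."""
    # Simplifying assumption: start from the first k points as centroids.
    centroids = [points[i] for i in range(k)]
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(coords) / len(members) for coords in zip(*members)
                )
    return centroids, assignments


# Hypothetical unlabeled data: note that no labels are supplied anywhere.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.0), (1.0, 0.5), (9.0, 8.5)]

centroids, assignments = kmeans(points, k=2)
print("assignments:", assignments)
print("centroids:", centroids)
```

The caller supplies `k`. The algorithm only decides which points belong together. It is up to you to judge whether two clusters, three, or more make the most sense for the product.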

Next, let’s jump into semi-supervised learning.

Semi-supervised learning

In a perfect world, we’d have massive well-labeled datasets with which to create optimal models that don’t overfit. Overfitting is when you create and tune a model to the dataset you have but it fits a bit too well, which means it’s optimized for that particular dataset and doesn’t work well with more diverse data. This is a common problem in data science. We live in an imperfect world and we can find ourselves in situations where we don’t have enough labeled data or enough data at all. This is where semi-supervised learning comes in handy. We give the model a small labeled dataset along with a larger unlabeled one, and the labels nudge the model in the right direction as it tries to find patterns on its own.

It doesn’t quite have the same level of absolute truth associated with supervised learning, but it does offer the model some helpful clues with which to organize its results so that it can find an easier path to the right answer.

For instance, let’s say you’re looking for a model that works well with detecting patterns in photos or speech. You might label a few of them and then see how the performance improves over time with the examples you don’t label. You can use multiple models in semi-supervised learning. The process would be a lot like supervised learning, which learns with labeled datasets so that it knows exactly how off it is from being correct. The main difference between supervised learning and semi-supervised learning is that you’re predicting labels for a portion of the new unlabeled data and then, essentially, checking those predictions against what the model learned from the labeled data. You then add the newly labeled data points the model most likely got correct to the training set, so that it keeps training on them.
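The loop described above is often called self-training or pseudo-labeling, and a minimal sketch of it follows. Everything here is an assumption made for illustration: the one-dimensional data, the "low"/"high" labels, the nearest-centroid model, and the `min_margin` confidence rule. Since unlabeled points have no true answer to check against, this sketch uses how decisively a point sits closer to one class than the other as a stand-in for "likely correct".

```python
def fit_centroids(labeled):
    """Train a tiny model: the mean value of each class."""
    sums, counts = {}, {}
    for x, label in labeled:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict_with_margin(centroids, x):
    """Return the nearest class and how much closer it is than the runner-up."""
    ranked = sorted((abs(x - c), label) for label, c in centroids.items())
    (best_dist, best_label), (second_dist, _) = ranked[0], ranked[1]
    return best_label, second_dist - best_dist


def self_train(labeled, unlabeled, min_margin=3.0, max_rounds=5):
    """Repeatedly pseudo-label confident points and retrain on them."""
    labeled = list(labeled)
    remaining = list(unlabeled)
    for _ in range(max_rounds):
        centroids = fit_centroids(labeled)
        confident, uncertain = [], []
        for x in remaining:
            label, margin = predict_with_margin(centroids, x)
            (confident if margin >= min_margin else uncertain).append((x, label))
        if not confident:
            break  # nothing new to learn from
        labeled.extend(confident)  # confident guesses join the training set
        remaining = [x for x, _ in uncertain]
    return fit_centroids(labeled), labeled, remaining


# Hypothetical data: four human-labeled points and five unlabeled ones.
labeled = [(1.0, "low"), (2.0, "low"), (9.0, "high"), (10.0, "high")]
unlabeled = [1.5, 8.5, 5.4, 3.0, 7.5]

centroids, final_labeled, still_unlabeled = self_train(labeled, unlabeled)
print("centroids:", centroids)
print("training set size:", len(final_labeled))
print("left unlabeled:", still_unlabeled)
```

Ambiguous points stay unlabeled rather than being forced into a class. This matters because a wrong pseudo-label would be trained on as if it were true.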

Finally, to wrap up this section, let’s take a brief look at reinforcement learning.

Reinforcement learning

This area of ML effectively learns with trial and error, so it’s learning from past behavior and adapting its approach to finding the best performance by itself. There’s a sequence to reinforcement learning and it’s really a system based on weights and rewards to reinforce correct results. Eventually, the model tries to optimize for these rewards and gets better with time. We see reinforcement learning used a lot with robotics, for instance, where robots are trained to understand how to operate and adjust to the parameters of the real world with all its unpredictability.
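Trial-and-error learning from rewards can be sketched with an epsilon-greedy multi-armed bandit, which is a deliberately simplified reinforcement learning setting with no states or sequences of actions. The reward values, the noise range, and the function name `run_bandit` are assumptions chosen for illustration. The agent does not know the true rewards in advance and has to estimate them from experience.

```python
import random


def run_bandit(true_means, steps=500, epsilon=0.1, seed=0):
    """Learn which option pays best purely from observed rewards."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms       # how often each option was tried
    estimates = [0.0] * n_arms  # running average reward per option

    def pull(arm):
        # Hidden from the agent: the true mean plus a little bounded noise.
        return true_means[arm] + rng.uniform(-0.1, 0.1)

    def update(arm, reward):
        counts[arm] += 1
        # Incremental average: nudge the estimate toward the new reward.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    # Try every option once so each has an initial estimate.
    for arm in range(n_arms):
        update(arm, pull(arm))

    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: try something at random
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit
        update(arm, pull(arm))
    return estimates, counts


estimates, counts = run_bandit([0.2, 0.5, 0.9])
best_arm = max(range(len(estimates)), key=lambda a: estimates[a])
print("estimated rewards:", estimates)
print("times each option was tried:", counts)
print("best option found:", best_arm)
```

The `epsilon` parameter controls the trade-off between exploring new actions and exploiting what already appears to work. That trade-off is the same tension a robot faces when adapting to an unpredictable environment.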

Now that we’ve discussed and understood the different ML types, let’s move on and understand the optimal flow of the ML process.
