
Python Machine Learning By Example: Unlock machine learning best practices with real-world use cases, Fourth Edition


What do you get with eBook?

  • Instant access to your Digital eBook purchase
  • Download this book in EPUB and PDF formats
  • Access this title in our online reader with advanced features
  • DRM FREE - Read whenever, wherever and however you want
  • AI Assistant (beta) to help accelerate your learning

Building a Movie Recommendation Engine with Naïve Bayes

As promised, in this chapter, we will kick off our supervised learning journey with machine learning classification, and specifically, binary classification. The goal of the chapter is to build a movie recommendation system, which is a good starting point for learning classification from a real-life example—movie streaming service providers are already doing this, and we can do the same.

In this chapter, you will learn the fundamental concepts of classification, including what it does and its various types and applications, with a focus on solving a binary classification problem using a simple, yet powerful, algorithm, Naïve Bayes. Finally, the chapter will demonstrate how to fine-tune a model, which is an important skill that every data science or machine learning practitioner should learn.

We will go into detail on the following topics:

  • Getting started with classification
  • Exploring...

Getting started with classification

Movie recommendation can be framed as a machine learning classification problem. If it is predicted that you’ll like a movie because you’ve liked or watched similar movies, for example, then it will be on your recommended list; otherwise, it won’t. Let’s get started by learning the important concepts of machine learning classification.

Classification is one of the main instances of supervised learning. Given a training set of data containing observations and their associated categorical outputs, the goal of classification is to learn a general rule that correctly maps the observations (also called features or predictive variables) to the target categories (also called labels or classes). Putting it another way, a trained classification model will be generated after the model learns from the features and targets of training samples, as shown in the first half of Figure 2.1. When new or unseen data comes in, the trained...

Exploring Naïve Bayes

The Naïve Bayes classifier belongs to the family of probabilistic classifiers. It computes the probabilities of each predictive feature (also referred to as an attribute or signal) of the data belonging to each class in order to make a prediction of the probability distribution over all classes. Of course, from the resulting probability distribution, we can conclude the most likely class that the data sample is associated with. What Naïve Bayes does specifically, as its name indicates, is as follows:

  • Bayes: As in, it maps the probability of observed input features given a possible class to the probability of the class given observed pieces of evidence based on Bayes’ theorem.
  • Naïve: As in, it simplifies probability computation by assuming that predictive features are mutually independent.
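
The two ideas above can be sketched in a few lines of Python. This is a minimal illustration, not the book's implementation: the function name `naive_bayes_posterior` and all of the probabilities are hypothetical, chosen only to make the arithmetic easy to follow.

```python
from math import prod


def naive_bayes_posterior(priors, likelihoods):
    """Return P(class | evidence) for every class.

    priors:      {class: P(class)}
    likelihoods: {class: [P(observed value of feature i | class), ...]}
    """
    # "Naive" step: assume the features are independent given the class,
    # so the joint likelihood is the product of the per-feature likelihoods.
    joint = {c: priors[c] * prod(likelihoods[c]) for c in priors}
    # "Bayes" step: divide by the evidence (the sum of the joint terms)
    # to turn the joint terms into a distribution over classes.
    evidence = sum(joint.values())
    return {c: joint[c] / evidence for c in joint}


# Hypothetical numbers, for illustration only.
priors = {"like": 0.6, "dislike": 0.4}
likelihoods = {"like": [0.8, 0.5], "dislike": [0.3, 0.4]}

posterior = naive_bayes_posterior(priors, likelihoods)
print(posterior)
```

The class with the largest posterior is the prediction. In practice, implementations usually sum log-probabilities instead of multiplying raw probabilities, which avoids numerical underflow when there are many features.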

I will explain Bayes’ theorem with examples in the next section.

Bayes’ theorem by example

It is important...

Implementing Naïve Bayes

After calculating the movie preference example by hand, as promised, we are going to implement Naïve Bayes from scratch. After that, we will implement it using the scikit-learn package.

Implementing Naïve Bayes from scratch

Before we develop the model, let’s define the toy dataset we just worked with:

>>> import numpy as np
>>> X_train = np.array([
...     [0, 1, 1],
...     [0, 0, 1],
...     [0, 0, 0],
...     [1, 1, 0]])
>>> Y_train = ['Y', 'N', 'Y', 'Y']
>>> X_test = np.array([[1, 1, 0]])

For the model, starting with the prior, we first group the data by label and record their indices by classes:

>>> def get_label_indices(labels):
...     """
...     Group samples based on their labels and return indices
...     @param labels: list of labels
...     @return: dict, {class1: [indices], class2: [indices]}
...     ...
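
The grouping step, and the prior that it feeds, can be sketched as follows. This is a minimal illustration rather than the book's listing: the names `group_indices_by_label` and `prior_from_indices` are hypothetical, and the prior here is a plain class frequency.

```python
from collections import defaultdict


def group_indices_by_label(labels):
    """Map each class label to the list of sample indices carrying it."""
    indices = defaultdict(list)
    for index, label in enumerate(labels):
        indices[label].append(index)
    return dict(indices)


def prior_from_indices(label_indices):
    """Estimate P(class) as the fraction of training samples in each class."""
    total = sum(len(ix) for ix in label_indices.values())
    return {label: len(ix) / total for label, ix in label_indices.items()}


# Labels of the toy dataset defined earlier in this section.
Y_train = ['Y', 'N', 'Y', 'Y']

label_indices = group_indices_by_label(Y_train)
prior = prior_from_indices(label_indices)
print(label_indices)
print(prior)
```

Keeping the indices, and not only the counts, is convenient because the same index lists can later be used to select the rows of `X_train` that belong to each class when estimating the likelihoods.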

Building a movie recommender with Naïve Bayes

After the toy example, it is now time to build a movie recommender (or, more specifically, movie preference classifier) using a real dataset. We herein use a movie rating dataset (https://grouplens.org/datasets/movielens/). The movie rating data was collected by the GroupLens Research group from the MovieLens website (http://movielens.org).

For demonstration purposes, we will use the stable MovieLens 1M Dataset. The ml-1m.zip file (size: about 6 MB) can be downloaded from https://files.grouplens.org/datasets/movielens/ml-1m.zip or https://grouplens.org/datasets/movielens/1m/. It has around 1 million ratings, ranging from 1 to 5 in whole-star increments, given by 6,040 users on 3,706 movies (released February 2003).

Unzip the ml-1m.zip file and you will see the following four files:

  • movies.dat: It contains the movie information in the format of MovieID::Title::Genres.
  • ratings.dat: It...
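
As a small illustration of the `::`-delimited layout described above, a single `movies.dat` record could be parsed as follows. This is a sketch under stated assumptions: the helper name `parse_movie_line` is hypothetical, and the code assumes that the genres field lists several genres separated by `|`.

```python
def parse_movie_line(line):
    """Split one 'MovieID::Title::Genres' record into typed fields."""
    movie_id, title, genres = line.rstrip("\n").split("::")
    # Assumption: multiple genres are pipe-separated within the last field.
    return int(movie_id), title, genres.split("|")


# A sample record in the MovieID::Title::Genres format.
sample = "1::Toy Story (1995)::Animation|Children's|Comedy\n"

movie_id, title, genres = parse_movie_line(sample)
print(movie_id, title, genres)
```

Because the separator is two characters long, a plain `str.split("::")` is the simplest option; CSV readers that expect a single-character delimiter need extra configuration for this format.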

Evaluating classification performance

Beyond accuracy, there are several metrics we can use to gain more insight and avoid class imbalance effects. These are as follows:

  • Confusion matrix
  • Precision
  • Recall
  • F1 score
  • The area under the curve

A confusion matrix summarizes testing instances by their predicted values and true values, presented as a contingency table:

Figure 2.8: Contingency table for a confusion matrix

To illustrate this, we can compute the confusion matrix of our Naïve Bayes classifier. We use the confusion_matrix function from scikit-learn to compute it, but it is very easy to code it ourselves:

>>> from sklearn.metrics import confusion_matrix
>>> print(confusion_matrix(Y_test, prediction, labels=[0, 1]))
[[ 60  47]
 [148 431]]

As you can see from the resulting confusion matrix, there are 47 false positive cases (where the model misinterprets a dislike as a like for a movie), and 148...
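
To show how little code the counts need, here is a minimal pure-Python sketch that builds a 2x2 confusion matrix by hand and derives precision, recall, and the F1 score from it. The function names `confusion_counts` and `precision_recall_f1` are hypothetical, the short label lists are made up, and the code assumes binary 0/1 labels with at least one predicted and one actual positive (otherwise the divisions are undefined).

```python
def confusion_counts(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for binary 0/1 labels.

    Rows are true classes and columns are predicted classes, matching
    the layout printed by confusion_matrix(..., labels=[0, 1]).
    """
    matrix = [[0, 0], [0, 0]]
    for truth, predicted in zip(y_true, y_pred):
        matrix[truth][predicted] += 1
    return matrix


def precision_recall_f1(matrix):
    """Compute precision, recall and F1 for the positive class (label 1)."""
    (tn, fp), (fn, tp) = matrix
    precision = tp / (tp + fp)   # share of predicted likes that are real likes
    recall = tp / (tp + fn)      # share of real likes that were found
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up labels, only to exercise the counting function.
toy_matrix = confusion_counts([1, 0, 1, 1, 0], [1, 1, 0, 1, 0])

# The confusion matrix reported above for the Naïve Bayes classifier.
nb_matrix = [[60, 47], [148, 431]]
precision, recall, f1 = precision_recall_f1(nb_matrix)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Reading the metrics off the matrix makes the trade-off visible: the false positives only lower precision, while the false negatives only lower recall.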

Tuning models with cross-validation

Evaluating on a single fixed testing set can be misleading, since the result depends heavily on which data points happen to fall into that set. Rather than relying on the classification results from one fixed testing set, as we did in the previous experiments, we usually apply the k-fold cross-validation technique to assess how a model will generally perform in practice.

In the k-fold cross-validation setting, the original data is first randomly divided into k equal-sized subsets, in which class proportion is often preserved. Each of these k subsets is then successively retained as the testing set for evaluating the model. During each trial, the remaining k - 1 subsets (excluding the one-fold holdout) form the training set for fitting the model. Finally, the average performance across all k trials is calculated to generate an overall result:

Figure 2.10: Diagram of 3-fold cross-validation
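
The procedure can be sketched in plain Python as follows. This is a minimal illustration, not the book's code: the names `k_fold_indices` and `cross_validate` are hypothetical, the `evaluate` callback stands in for "train on these indices, score on those", and the split is a plain shuffle that does not preserve class proportions (scikit-learn's `StratifiedKFold` is the usual choice when that matters).

```python
import random


def k_fold_indices(n_samples, k, seed=42):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Taking every k-th shuffled index gives folds whose sizes differ by at most 1.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        # The training set is everything outside the held-out fold.
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx


def cross_validate(n_samples, k, evaluate):
    """Average the score returned by evaluate(train_idx, test_idx) over k folds."""
    scores = [evaluate(train, test) for train, test in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)


splits = list(k_fold_indices(10, 3))
for train_idx, test_idx in splits:
    print(sorted(test_idx), "held out;", len(train_idx), "samples for training")
```

Every sample is used for testing exactly once and for training k - 1 times, which is why the averaged score is less sensitive to one lucky or unlucky split.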

Statistically, the...

Summary

In this chapter, you learned about the fundamental concepts of machine learning classification, including types of classification, classification performance evaluation, cross-validation, and model tuning. You also learned about the simple, yet powerful, classifier, Naïve Bayes. We went in depth through the mechanics and implementations of Naïve Bayes with a couple of examples, the most important one being the movie recommendation project.

Binary classification using Naïve Bayes was the main talking point of this chapter. In the next chapter, we will solve ad click-through prediction using another binary classification algorithm: a decision tree.

Exercises

  1. As mentioned earlier, we extracted user-movie relationships only from the movie rating data where most ratings are unknown. Can you also utilize data from the movies.dat and users.dat files?
  2. Practice makes perfect—another great project to deepen your understanding could be heart disease classification. The dataset can be downloaded directly from https://archive.ics.uci.edu/ml/datasets/Heart+Disease.
  3. Don’t forget to fine-tune the model you obtained from Exercise 2 using the techniques you learned in this chapter. What is the best AUC it achieves?

References

To acknowledge the use of the MovieLens dataset in this chapter, I would like to cite the following paper:

F. Maxwell Harper and Joseph A. Konstan. 2015. The MovieLens Datasets: History and Context. ACM Transactions on Interactive Intelligent Systems (TiiS) 5, 4, Article 19 (December 2015), 19 pages. DOI: http://dx.doi.org/10.1145/2827872.

Join our book’s Discord space

Join our community’s Discord space for discussions with the authors and other readers:

https://packt.link/yuxi


Key benefits

  • Discover new and updated content on NLP transformers, PyTorch, and computer vision modeling
  • Includes a dedicated chapter on best practices and additional best practice tips throughout the book to improve your ML solutions
  • Implement ML models, such as neural networks and linear and logistic regression, from scratch
  • Purchase of the print or Kindle book includes a free PDF copy

Description

The fourth edition of Python Machine Learning By Example is a comprehensive guide for beginners and experienced machine learning practitioners who want to learn more advanced techniques, such as multimodal modeling. Written by experienced machine learning author and ex-Google machine learning engineer Yuxi (Hayden) Liu, this edition emphasizes best practices, providing invaluable insights for machine learning engineers, data scientists, and analysts. Explore advanced techniques, including two new chapters on natural language processing transformers with BERT and GPT, and multimodal computer vision models with PyTorch and Hugging Face. You’ll learn key modeling techniques using practical examples, such as predicting stock prices and creating an image search engine. This hands-on machine learning book navigates through complex challenges, bridging the gap between theoretical understanding and practical application. Elevate your machine learning and deep learning expertise, tackle intricate problems, and unlock the potential of advanced techniques in machine learning with this authoritative guide.

Who is this book for?

This expanded fourth edition is ideal for data scientists, ML engineers, analysts, and students with Python programming knowledge. The real-world examples, best practices, and code prepare anyone undertaking their first serious ML project.

What you will learn

  • Follow machine learning best practices throughout data preparation and model development
  • Build and improve image classifiers using convolutional neural networks (CNNs) and transfer learning
  • Develop and fine-tune neural networks using TensorFlow and PyTorch
  • Analyze sequence data and make predictions using recurrent neural networks (RNNs), transformers, and CLIP
  • Build classifiers using support vector machines (SVMs) and boost performance with PCA
  • Avoid overfitting using regularization, feature selection, and more

Product Details

Publication date: Jul 31, 2024
Length: 518 pages
Edition: 4th
Language: English
ISBN-13: 9781835082225
Vendor: Google



Table of Contents

17 Chapters

  • Getting Started with Machine Learning and Python
  • Building a Movie Recommendation Engine with Naïve Bayes
  • Predicting Online Ad Click-Through with Tree-Based Algorithms
  • Predicting Online Ad Click-Through with Logistic Regression
  • Predicting Stock Prices with Regression Algorithms
  • Predicting Stock Prices with Artificial Neural Networks
  • Mining the 20 Newsgroups Dataset with Text Analysis Techniques
  • Discovering Underlying Topics in the Newsgroups Dataset with Clustering and Topic Modeling
  • Recognizing Faces with Support Vector Machine
  • Machine Learning Best Practices
  • Categorizing Images of Clothing with Convolutional Neural Networks
  • Making Predictions with Sequences Using Recurrent Neural Networks
  • Advancing Language Understanding and Generation with the Transformer Models
  • Building an Image Search Engine Using CLIP: a Multimodal Approach
  • Making Decisions in Complex Environments with Reinforcement Learning
  • Other Books You May Enjoy
  • Index

Customer reviews

Rating: 4.9 out of 5 (8 ratings)
5 star: 87.5%
4 star: 12.5%
3 star: 0%
2 star: 0%
1 star: 0%

Top Reviews

Jacob Smith Sep 21, 2024
Rating: 5/5
This book is an absolute gem for anyone looking to dive deep into the world of machine learning using Python! From the moment I opened it, I was impressed by the clear, concise explanations and the practical examples that make even the most complex topics easy to understand. The author does a fantastic job of breaking down key machine learning algorithms, explaining not just the "how" but the "why" behind each method. The inclusion of real-world datasets and hands-on exercises makes it easy to follow along and apply what you've learned immediately.
Amazon Verified review
Ayon Roy Sep 05, 2024
Rating: 5/5
Starting my journey in machine learning was both exciting and overwhelming. I struggled to bridge the gap between theory and practical application in real-world projects. That’s why Yuxi Hayden Liu’s "Python Machine Learning by Example" has been a game-changer for me. This book offers a structured approach, making it easier to transition from learning to execution. Liu covers essential topics like overfitting, underfitting, and cross-validation right from the start, ensuring that you grasp the fundamentals. What truly sets this book apart is the hands-on projects that accompany each concept. From building a movie recommendation engine using Naive Bayes to predicting stock prices and exploring deep learning through artificial neural networks, Liu walks you through each step, from data preparation to model evaluation. The book is rich with best practices, such as feature engineering, algorithm selection, and monitoring model performance. By the end, you'll not only have a solid understanding of basic and advanced topics, including CNNs, transformer models, and reinforcement learning, but you’ll also feel confident applying them in real-world scenarios. Yuxi Hayden Liu’s industry experience shines through, making this book an invaluable guide for anyone feeling lost in their machine learning journey. Highly recommended for both students and professionals looking to elevate their skills. Happy reading!
Amazon Verified review
C. C Chin Oct 14, 2024
Rating: 5/5
Need hands on ML newbie!! Also Python newbie too but got computer science degree!! Ready all 5* reviews, book perfect for Machine learning newbie and Python newbie and AWS MLS-C01 exam and entry level machine learning specialty exam and Sagemaker studio!! All new for me!!! Need examples to make practice exams answers to help for AWS mls-c01 machine learning specialty exam AWS Sagemaker studio too, since all new to me!!! Got book October 13, 2024!! And pdf too!! Reading now to do ML example!! Got Oliver beginner book, udemy class Book 3 months old pretty new, October 14, 2024!!! Explain Oliver beginner book got 3 of those!!
Amazon Verified review
saandeep sreerambatla Jul 31, 2024
Rating: 5/5
"Python Machine Learning by Example, Fourth Edition" by Yuxi (Hayden) Liu is a fantastic resource for anyone interested in machine learning, whether you're just starting out or already have some experience. This book strikes a great balance between explaining the theory behind machine learning and showing you how to apply it in real-world scenarios, making it an essential addition to any data scientist’s collection. The book is well-organized, kicking off with the basics of machine learning and Python programming. Liu does an excellent job of explaining why machine learning is so important today and then helps you set up your Python environment. This ensures that even those with minimal programming experience can keep up. What really stands out about this book is its hands-on approach. Each chapter is packed with real-world examples that help bring complex machine learning concepts to life. For instance, the chapters on building a movie recommendation engine with Naïve Bayes and predicting stock prices with regression algorithms are particularly insightful, showing you exactly how these models work and how to apply them to real problems. The book also covers advanced topics like deep learning, natural language processing (NLP), and reinforcement learning. The sections on convolutional neural networks (CNNs) for image classification and recurrent neural networks (RNNs) for sequence prediction are especially useful. They provide a deep dive into these advanced models, complete with code examples using TensorFlow and PyTorch, which are incredibly helpful for anyone looking to implement these techniques in their own projects. Another great feature of this book is the focus on best practices. Liu includes 21 best practices that cover the entire machine learning workflow, from data preparation to model deployment and monitoring. This is invaluable for anyone looking to build robust and scalable machine learning solutions. It's worth noting that the book assumes you have a basic understanding of Python and some familiarity with statistical concepts. This might be a bit challenging for complete beginners, but it doesn't take away from the overall value of the book. Instead, it sets a realistic expectation for the level of expertise needed to fully benefit from the content. In conclusion, "Python Machine Learning by Example, Fourth Edition" is an excellent resource that bridges the gap between theory and practice. Yuxi (Hayden) Liu's clear explanations, practical examples, and focus on best practices make this book a must-read for anyone serious about mastering machine learning with Python. Whether you're a data analyst, a machine learning engineer, or a data scientist, this book will provide you with the tools and knowledge you need to succeed.
Amazon Verified review
Thomas M. Aug 21, 2024
Rating: 5/5
I highly recommend Liu's Python ML by Example! As a long term practitioner of all things analytics and data science, it was refreshing to come back to the foundations with this book. I wish I had this resource available when I was originally getting started in the field, as Liu has a knack for covering a broad range of salient topics in ML, while still offering plenty of depth for those looking to go into the weeds of how algorithms work. Super practical, this book focuses on real-life examples, spanning marketing & ads, content recommendations, text sentiment, image classification and beyond. The book also navigates tabular ML and deep learning concepts flawlessly. Liu doesn't stop at the fundamentals; the book also covers advanced topics like deep learning, natural language processing (NLP), and reinforcement learning. The sections on convolutional neural networks (CNNs) for image classification and recurrent neural networks (RNNs) for sequence prediction offer valuable insights into these cutting-edge techniques. These topics are all presented in ways that even new-to-ML readers would be able to grasp. These days, no ML book is complete without including GenAI as a topic, which the author integrates seamlessly. All around a super well rounded and practical read!
Amazon Verified review