You're reading from Hands-On Recommendation Systems with Python

Product typeBook

Published inJul 2018

Reading LevelExpert

PublisherPackt

ISBN-139781788993753

Edition1st Edition

Languages

Python

Tools

TensorFlow Scikit-learn

Concepts

Machine Learning

Author (1)

Rounak Banik

Building Collaborative Filters

In the previous chapter, we mathematically defined the collaborative filtering problem and gained an understanding of various data mining techniques that we assumed would be useful in solving this problem.

The time has finally come for us to put our skills to the test. In the first section, we will construct a well-defined framework that will allow us to build and test our collaborative filtering models effortlessly. This framework will consist of the data, the evaluation metric, and a corresponding function to compute that metric for a given model.

Technical requirements

You will be required to have Python installed on a system. Finally, to use the Git repository of this book, the user needs to install Git.

The code files of this chapter can be found on GitHub:
https://github.com/PacktPublishing/Hands-On-Recommendation-Systems-with-Python.

Check out the following video to see the code in action:

http://bit.ly/2mFmgRo .

The framework

Just like the knowledge-based and content-based recommenders, we will build our collaborative filtering models in the context of movies. Since collaborative filtering demands data on user behavior, we will be using a different dataset known as MovieLens.

The MovieLens dataset

The MovieLens dataset is made publicly available by GroupLens Research, a computer science lab at the University of Minnesota. It is one of the most popular benchmark datasets used to test the potency of various collaborative filtering models and is usually available in most recommender libraries and packages:

MovieLens gives us user ratings on a variety of movies and is available in various sizes. The full version consists of more than...

User-based collaborative filtering

In Chapter 1, Getting Started with Recommender Systems, we learned what user-based collaborative filters do: they find users similar to a particular user and then recommend products that those users have liked to the first user.

In this section, we will implement this idea in code. We will build filters of increasing complexity and gauge their performance using the framework we constructed in the previous section.

To aid us in this process, let's first build a ratings matrix (described in Chapters 1, Getting Started with Recommender Systems and Chapter 5, Getting Started with Data Mining Techniques) where each row represents a user and each column represents a movie. Therefore, the value in the i^th row and j^th column will denote the rating given by user i to movie j. As usual, pandas gives us a very useful function, called pivot_table, to...

Item-based collaborative filtering

Item-based collaborative filtering is essentially user-based collaborative filtering where the users now play the role that items played, and vice versa.

In item-based collaborative filtering, we compute the pairwise similarity of every item in the inventory. Then, given user_id and movie_id, we compute the weighted mean of the ratings given by the user to all the items they have rated. The basic idea behind this model is that a particular user is likely to rate two items that are similar to each other similarly.

Building an item-based collaborative filter is left as an exercise to the reader. The steps involved are exactly the same except now, as mentioned earlier, the movies and users have swapped places.

Model-based approaches

The collaborative filters we have built thus far are known as memory-based filters. This is because they only make use of similarity metrics to come up with their results. They learn any parameters from the data or assign classes/clusters to the data. In other words, they do not make use of machine learning algorithms.

In this section, we will take a look at some filters that do. We spent an entire chapter looking at various supervised and unsupervised learning techniques. The time has finally come to see them in action and test their potency.

Clustering

In our weighted mean-based filter, we took every user into consideration when trying to predict the final rating. In contrast, our demographic-based...

Summary

This brings us to the end of our discussion on collaborative filters. In this chapter, we built various kinds of user-based collaborative filters and, by extension, learned to build item-based collaborative filters as well.

We then shifted our focus to model-based approaches that rely on machine learning algorithms to churn out predictions. We were introduced to the surprise library and used it to implement a clustering model based on kNN. We then took a look at an approach to using supervised learning algorithms to predict the missing values in the ratings matrix. Finally, we gained a layman's understanding of the singular-value decomposition algorithm and implemented it using surprise.

All the recommenders we've built so far reside only inside our Jupyter Notebooks. In the next chapter, we will learn how to deploy our models to the web, where they can be used...

The rest of the chapter is locked

You have been reading a chapter from

Hands-On Recommendation Systems with Python

Published in: Jul 2018Publisher: PacktISBN-13: 9781788993753

A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.

undefined

Unlock this book and the full library FREE for 7 days

Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of

Start free trial

Renews at $15.99/month. Cancel anytime

Author (1)

Rounak Banik

Rounak Banik is a Young India Fellow and an ECE graduate from IIT Roorkee. He has worked as a software engineer at Parceed, a New York start-up, and Springboard, an EdTech start-up based in San Francisco and Bangalore. He has also served as a backend development instructor at Acadview, teaching Python and Django to around 35 college students from Delhi and Dehradun. He is an alumni of Springboard's data science career track. He has given talks at the SciPy India Conference and published popular tutorials on Kaggle and DataCamp.
Read more about Rounak Banik

Other recommended products

Related to this chapter

Python Machine Learning Workbook for Beginners

Through a series of machine learning and data science projects, this book represents a beginner-friendly crash course to Python’s practical application in businesses and your own career.

BookMar 2021279 pages

Mastering Predictive Analytics with scikit-learn and TensorFlow

In this book, you will find a range of methods to improve the performance of almost any predictive model, from ensemble methods to dimensionality reduction and cross-validation. You will learn the tools to produce advanced predictive models. In addition, you will dive into the exiting field of Deep Learning using TensorFlow.

BookSep 2018154 pages

Machine Learning with Scala Quick Start Guide

Scala as a programming language is a highly scalable integration of object-oriented and functional programming, which makes it easy to build scalable and complex big data applications. This book is a handy guide for machine learning developers and data scientists who want to train effective machine learning models using this popular language.

BookApr 2019220 pages

R Data Analysis Projects

R offers a large variety of packages and libraries for fast and accurate data analysis and visualization. As a result, it is one of the most popularly used languages by data scientists and analysts, or anyone who wants to perform data analysis. In this book, we show you just how to do that - with the help of practical implementations of real-world use cases.

BookNov 2017366 pages

Supervised Machine Learning with Python

A supervised learning task infers a function from flagged training data and maps an input to an output based on sample input-output pairs. In this book, you will learn various machine learning techniques (such as linear and logistic regression) and gain the practical knowledge you need to quickly and powerfully apply algorithms to new problems.

BookMay 2019162 pages

Hands-On Machine Learning with scikit-learn and Scientific Python Toolkits

This book covers the theory and practice of building data-driven solutions. Includes the end-to-end process, using supervised and unsupervised algorithms. With each algorithm, you will learn the data acquisition and data engineering methods, the apt metrics, and the available hyper-parameters. You will learn how to deploy the models in production.

BookJul 2020384 pages

R Machine Learning Projects

The purpose of the book is to help a machine learning practitioner gets hands-on experience in working with real-world data and apply modern machine learning algorithms. You will learn to implement each algorithm to a specific industry problem. It covers projects involving both supervised as well as unsupervised learning approaches.

BookJan 2019334 pages

Hands-On Data Science and Python Machine Learning

This book will help you take your first steps in the world of data science. It will empower you to conduct data analysis and perform efficient machine learning using Python. You will gain value from your data using the various data mining and data analysis techniques in Python, and develop efficient predictive models to predict future results. You will also learn how to perform large-scale machine learning on Big Data using Apache Spark.

BookJul 2017420 pages

Machine Learning Solutions

This book demonstrates a set of simple to complex problems you may encounter while building machine learning models. You'll not only learn the best possible solutions to these problems but also find out how to build projects based on each problem mentioned in the book, with a practical approach and easy-to-follow examples.

BookApr 2018566 pages

Feature Engineering Made Easy

Feature engineering is the most important step in creating powerful machine learning systems. This book will take you through the entire feature-engineering journey to make your machine learning much more systematic and effective.

BookJan 2018316 pages

Learning Data Mining with Python

The next step in the information age is to gain insights from the deluge of data coming our way. Data mining provides a way of finding these insights, and Python is one of the most popular languages for data mining because it provides both power and flexibility in analysis.

BookApr 2017358 pages

Machine Learning with Spark

Spark ML is the machine learning module of Spark. It uses in-memory RDDs to process machine learning models faster for clustering, classification, and regression.

BookApr 2017532 pages

Personalised recommendations for you

Based on your interests and search pattern

Et al.

Ever wonder why speech recognition systems don't understand the Scottish accent, or what would happen if an astronaut only ate mac 'n' cheese, or other spurious reflections you'd have at a bar? We did, then collated those deliberations into absurd research articles with fake figures and methodologies inspired by even more fictionally absurd studies.

BookAug 2023230 pages5

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages4

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages5

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages1

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages5

Mastering Tableau 2023

This book is a comprehensive resource to mastering your Tableau skills and becoming a BI expert. As you progress, you will learn how to build advanced dashboards and improve your storytelling to derive key business insight, as well as make you well-versed with advanced functionalities of Tableau in the business intelligence domain.

BookAug 2023684 pages

Building AI Applications with ChatGPT APIs

This guide covers all ChatGPT API features for effortless creation of robust AI powered apps. With its help, you’ll be able to leverage ChatGPT’s cutting-edge NLP models to take your app development skills to the next level. You’ll also work on ten exciting projects that will give you the practical know-how that you can apply to your existing applications.

BookSep 2023258 pages5

Building AI Applications with ChatGPT APIs

This guide covers all ChatGPT API features for effortless creation of robust AI powered apps. With its help, you’ll be able to leverage ChatGPT’s cutting-edge NLP models to take your app development skills to the next level. You’ll also work on ten exciting projects that will give you the practical know-how that you can apply to your existing applications.

BookSep 2023258 pages2

Data Engineering with AWS

Embark on a journey to master data engineering pipelines on AWS! Our book offers a hands-on experience of AWS services for ingesting, transforming, and consuming data. Whether you're an absolute beginner or someone with basic data engineering experience, this guide is an indispensable resource.

BookOct 2023636 pages5

Modern Data Architecture on AWS

Every organization wants an agile, performant, and cost-effective data platform that meets all their current and future business needs. Purpose-built AWS analytics services and their features play a big part in building such a modern data platform. This book brings to you all the design and architectural patterns that’ll help you achieve this goal.

BookAug 2023420 pages5

Practical Guide to Applied Conformal Prediction in Python

Discover the power of Conformal Prediction with the "Practical Guide to Applied Conformal Prediction in Python." Master the latest techniques to quantify uncertainty in machine learning and computer vision models, and seamlessly apply them to your industry applications.

BookDec 2023240 pages

TinyML Cookbook

With over 70 project-based recipes, the TinyML Cookbook is a practical guide that will help you to get the most out of your microcontrollers. It provides a comprehensive understanding of the theoretical foundations while giving you hands-on experience training ML models for deployment on Arduino Nano 33 BLE Sense, Raspberry Pi Pico, and SparkFun RedBoard Artemis Nano microcontrollers.

BookNov 2023664 pages