Hands-On Recommendation Systems with Python
Hands-On Recommendation Systems with Python: Start building powerful and personalized, recommendation engines with Python

By Rounak Banik
Book Jul 2018 146 pages 1st Edition
Product Details


Publication date : Jul 31, 2018
Length 146 pages
Edition : 1st Edition
Language : English
ISBN-13 : 9781788993753
Vendor :
Google

Hands-On Recommendation Systems with Python

Getting Started with Recommender Systems

Almost everything we buy or consume today is influenced by some form of recommendation, whether from friends, family, external reviews, or, more recently, the sources selling you the product. When you log on to Netflix or Amazon Prime, for example, you will see a list of movies and television shows the service thinks you will like based on your past watching (and rating) history. Facebook suggests people it thinks you may know and would probably like to add. It also curates a News Feed for you based on the posts you've liked, the people you've befriended, and the pages you've followed. Amazon recommends items to you as you browse for a particular product. It shows you similar products from a competing source and suggests auxiliary items frequently bought together with the product.

So, it goes without saying that providing a good recommendation is at the core of successful business for these companies. It is in Netflix's best interests to engage you with content that you love so that you continue to subscribe to its service; the more relevant the items Amazon shows you, the greater your chances – and volume – of purchases will be, which directly translates to greater profits. Equally, establishing friendship is key to Facebook's power and influence as an almost omnipotent social network, which it then uses to churn money out of advertising.

In this introductory chapter, we will acquaint ourselves with the world of recommender systems, covering the following topics:

  • What is a recommender system? What can it do and not do?
  • The different types of recommender systems

Technical requirements

What is a recommender system?

Recommender systems are largely self-explanatory; as the name suggests, they are systems or techniques that recommend or suggest a particular product, service, or entity. However, the problem they solve can be formulated in the following two ways, which differ in their approach to providing recommendations.

The prediction problem

In this version of the problem, we are given a matrix of m users and n items. Each row of the matrix represents a user and each column represents an item. The value of the cell in the i-th row and the j-th column denotes the rating given by user i to item j. This value is usually denoted as $r_{ij}$.

For instance, consider the matrix in the following screenshot:

This matrix has seven users rating six items. Therefore, m = 7 and n = 6. User 1 has given item 1 a rating of 4. Therefore, $r_{11} = 4$.

Let us now consider a more concrete example. Imagine you are Netflix and you have a repository of 20,000 movies and 5,000 users. You have a system in place that records every rating that each user gives to a particular movie. In other words, you have the rating matrix (of shape 5,000 × 20,000) with you.

However, all your users will have seen only a fraction of the movies you have available on your site; therefore, the matrix you have is sparse. In other words, most of the entries in your matrix are empty, as most users have not rated most of your movies.

The prediction problem, therefore, aims to predict these missing values using all the information it has at its disposal (the ratings recorded, data on movies, data on users, and so on). If it is able to predict the missing values accurately, it will be able to give great recommendations. For example, if user i has not used item j, but our system predicts a very high rating (denoted by $\hat{r}_{ij}$), it is highly likely that i will love j should they discover it through the system.
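To make the sparse matrix and its missing values concrete, here is a minimal sketch in plain Python. The 4 × 5 matrix and its ratings are invented for illustration, and the item-mean fill is only a naive baseline chosen for brevity; it is an assumption, not the prediction method this book goes on to build.

```python
# Hypothetical 4-user x 5-item rating matrix; None marks a missing rating.
ratings = [
    [4,    None, 3,    None, None],
    [None, 5,    None, None, 2],
    [5,    None, None, 4,    None],
    [None, 3,    4,    None, None],
]


def sparsity(matrix):
    """Return the fraction of cells that hold no recorded rating."""
    cells = [r for row in matrix for r in row]
    return sum(r is None for r in cells) / len(cells)


def item_mean_baseline(matrix):
    """Fill each missing cell with the mean rating of its item (column).

    Falls back to the global mean when an item has no ratings at all.
    """
    known = [r for row in matrix for r in row if r is not None]
    global_mean = sum(known) / len(known)
    item_means = []
    for j in range(len(matrix[0])):
        column = [row[j] for row in matrix if row[j] is not None]
        item_means.append(sum(column) / len(column) if column else global_mean)
    return [
        [r if r is not None else item_means[j] for j, r in enumerate(row)]
        for row in matrix
    ]


predicted = item_mean_baseline(ratings)
print(f"sparsity: {sparsity(ratings):.2f}")
print(predicted[0])
```

Recorded ratings are left untouched; only the empty cells receive an estimate. A real system would replace the baseline with a model that uses the user's own history as well.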

The ranking problem

Ranking is the more intuitive formulation of the recommendation problem. Given a set of n items, the ranking problem tries to discern the top k items to recommend to a particular user, utilizing all of the information at its disposal.

As in the preceding example, imagine you are a company; this time, Airbnb. Your user has entered the specific things they are looking for in their host and the space (such as location and budget). You want to display the top 10 results that satisfy those conditions. This would be an example of the ranking problem.

It is easy to see that the prediction problem often boils down to the ranking problem. If we are able to predict missing values, we can extract the top values and display them as our results.
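The reduction from prediction to ranking can be sketched in a few lines. The function name `top_k_items` and the predicted scores below are hypothetical; the sketch assumes a row of predicted ratings for one user is already available.

```python
import heapq


def top_k_items(predicted_row, already_rated, k):
    """Return the indices of the k highest-predicted items not yet rated."""
    candidates = [
        item for item in range(len(predicted_row)) if item not in already_rated
    ]
    return heapq.nlargest(k, candidates, key=lambda item: predicted_row[item])


# Hypothetical predicted ratings for one user over six items.
predicted_row = [4.0, 2.5, 4.8, 3.9, 4.6, 1.0]
already_rated = {0}  # the user has already rated item 0

print(top_k_items(predicted_row, already_rated, k=3))
```

Items the user has already rated are excluded before ranking, since recommending them again adds nothing.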

In this book, we will look at both formulations and build systems that effectively solve them.

Types of recommender systems

In recommender systems, as with almost every other machine learning problem, the techniques and models you use (and the success you enjoy) are heavily dependent on the quantity and quality of the data you possess. In this section, we will gain an overview of three of the most popular types of recommender systems, in decreasing order of the data they require in order to function efficiently.

Collaborative filtering

Collaborative filtering leverages the power of community to provide recommendations. Collaborative filters are one of the most popular recommender models used in the industry and have found huge success for companies such as Amazon. Collaborative filtering can be broadly classified into two types.

User-based filtering

The main idea behind user-based filtering is that users who have bought and liked similar items in the past are likely to buy similar items in the future too. Therefore, these models recommend items to a user that similar users have also liked. Amazon's "Customers who bought this item also bought" feature is an example of this filter, as shown in the following screenshot:

Imagine that Alice and Bob mostly like and dislike the same video games. Now, imagine that a new video game has been launched on the market. Let's say Alice bought the game and loved it. Since we have discerned that their tastes in video games are extremely similar, it's likely that Bob will like the game too; hence, the system recommends the new video game to Bob.
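The Alice and Bob scenario can be sketched as a nearest-neighbour lookup. The ratings, the game names, and the third user Carol are invented for illustration, and raw cosine similarity over co-rated items is an assumption made for brevity rather than the similarity measure this book settles on.

```python
from math import sqrt

# Hypothetical ratings on a 1-5 scale; Bob has not rated "new_game" yet.
ratings = {
    "Alice": {"game_a": 5, "game_b": 1, "game_c": 4, "new_game": 5},
    "Bob": {"game_a": 5, "game_b": 2, "game_c": 4},
    "Carol": {"game_a": 1, "game_b": 5, "game_c": 2, "new_game": 1},
}


def cosine_similarity(u, v):
    """Cosine similarity computed over the items both users have rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = sqrt(sum(u[i] ** 2 for i in common))
    norm_v = sqrt(sum(v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v)


def most_similar_user(user):
    """Return the other user whose ratings are closest to this user's."""
    others = [name for name in ratings if name != user]
    return max(others, key=lambda name: cosine_similarity(ratings[user], ratings[name]))


def recommend(user, min_rating=4):
    """Recommend items the most similar user rated highly and this user lacks."""
    neighbour = ratings[most_similar_user(user)]
    return sorted(
        item for item, r in neighbour.items()
        if r >= min_rating and item not in ratings[user]
    )


print(most_similar_user("Bob"))
print(recommend("Bob"))
```

Because Alice's ratings track Bob's far more closely than Carol's do, Alice is chosen as the neighbour and her highly rated new game is what Bob is offered.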

Item-based filtering

If a group of people have rated two items similarly, then the two items must be similar. Therefore, if a person likes one particular item, they're likely to be interested in the other item too. This is the principle on which item-based filtering works. Again, Amazon makes good use of this model by recommending products to you based on your browsing and purchase history, as shown in the following screenshot:

Item-based filters, therefore, recommend items based on the past ratings of users. For example, imagine that Alice, Bob, and Eve have all given War and Peace and The Picture of Dorian Gray a rating of excellent. Now, when someone buys The Picture of Dorian Gray, the system will recommend War and Peace, as it has identified that, in most cases, if someone likes one of those books, they will like the other, too.
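The same idea can be sketched from the item side: compare items by the ratings they received from the same users. The numeric ratings, the fourth user Dave, and the third title are hypothetical additions needed to give the comparison some contrast, and raw cosine similarity is again an assumption chosen for brevity.

```python
from math import sqrt

# Hypothetical ratings on a 1-5 scale (5 standing in for "excellent").
ratings = {
    "Alice": {"War and Peace": 5, "The Picture of Dorian Gray": 5, "Hypothetical Thriller": 2},
    "Bob": {"War and Peace": 5, "The Picture of Dorian Gray": 5, "Hypothetical Thriller": 1},
    "Eve": {"War and Peace": 5, "The Picture of Dorian Gray": 5, "Hypothetical Thriller": 3},
    "Dave": {"War and Peace": 1, "The Picture of Dorian Gray": 2, "Hypothetical Thriller": 5},
}


def item_vectors(user_ratings):
    """Turn user -> {item: rating} into item -> {user: rating}."""
    vectors = {}
    for user, items in user_ratings.items():
        for item, rating in items.items():
            vectors.setdefault(item, {})[user] = rating
    return vectors


def cosine_similarity(a, b):
    """Cosine similarity over the users who rated both items."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[u] * b[u] for u in common)
    norm_a = sqrt(sum(a[u] ** 2 for u in common))
    norm_b = sqrt(sum(b[u] ** 2 for u in common))
    return dot / (norm_a * norm_b)


def most_similar_item(item, vectors):
    """Return the other item whose rating pattern is closest to this one."""
    others = [other for other in vectors if other != item]
    return max(others, key=lambda other: cosine_similarity(vectors[item], vectors[other]))


vectors = item_vectors(ratings)
print(most_similar_item("The Picture of Dorian Gray", vectors))
```

Note that the similarity is computed purely from who rated what; nothing about the books' content is consulted.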

Shortcomings

One of the biggest prerequisites of a collaborative filtering system is the availability of data of past activity. Amazon is able to leverage collaborative filters so well because it has access to data concerning millions of purchases from millions of users.

Therefore, collaborative filters suffer from what we call the cold start problem. Imagine you have started an e-commerce website – to build a good collaborative filtering system, you need data on a large number of purchases from a large number of users. However, you don't have either, and it's therefore difficult to build such a system from the start.

Content-based systems

Unlike collaborative filters, content-based systems do not require data relating to past activity. Instead, they provide recommendations based on a user profile and the metadata they have on particular items.

Netflix is an excellent example of the aforementioned system. The first time you sign in to Netflix, it doesn't know what your likes and dislikes are, so it is not in a position to find users similar to you and recommend the movies and shows they have liked.

As shown in the previous screenshot, what Netflix does instead is ask you to rate a few movies that you have watched before. Based on this information and the metadata it already has on movies, it creates a watchlist for you. For instance, if you enjoyed the Harry Potter and Narnia movies, the content-based system can identify that you like movies based on fantasy novels and will recommend a movie such as Lord of the Rings to you.

However, since content-based systems don't leverage the power of the community, they often come up with results that are not as impressive or relevant as the ones offered by collaborative filters. In other words, content-based systems usually provide recommendations that are obvious. There is little novelty in a Lord of the Rings recommendation if Harry Potter is your favorite movie.
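A content-based recommender of this kind can be sketched by matching item metadata against a profile built from the titles a user liked. The tags below are invented for illustration, as is the fourth title, and Jaccard overlap is an assumption chosen for simplicity rather than the technique used later in the book.

```python
# Hypothetical metadata: each movie is described by a set of tags.
movies = {
    "Harry Potter": {"fantasy", "based_on_novel", "magic"},
    "Narnia": {"fantasy", "based_on_novel", "children"},
    "Lord of the Rings": {"fantasy", "based_on_novel", "epic"},
    "Hypothetical Courtroom Drama": {"drama", "legal"},
}


def jaccard(a, b):
    """Size of the intersection divided by the size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(liked, k=1):
    """Rank unseen movies by tag overlap with the user's liked movies."""
    profile = set().union(*(movies[title] for title in liked))
    scores = {
        title: jaccard(profile, tags)
        for title, tags in movies.items()
        if title not in liked
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


print(recommend(["Harry Potter", "Narnia"]))
```

The sketch also shows why such recommendations tend to be obvious: the top result is, by construction, the item that most resembles what the user already likes.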

Knowledge-based recommenders

Knowledge-based recommenders are used for items that are very rarely bought. It is simply impossible to recommend such items based on past purchasing activity or by building a user profile. Take real estate, for instance. Real estate is usually a once-in-a-lifetime purchase for a family. It is not possible to have a history of real estate purchases for existing users to leverage into a collaborative filter, nor is it always feasible to ask a user their real estate purchase history.

In such cases, you build a system that asks for certain specifics and preferences and then provides recommendations that satisfy those conditions. In the real estate example, for instance, you could ask the user about their requirements for a house, such as its locality, their budget, the number of rooms, the number of storeys, and so on. Based on this information, you can then recommend properties that satisfy all of these conditions.

Knowledge-based recommenders also suffer from the problem of low novelty, however. Users know full well what to expect from the results and are seldom taken by surprise.
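In its simplest form, a knowledge-based recommender is a filter over explicit requirements. The sketch below assumes a hypothetical `Property` record and a handful of invented listings; the field names mirror the requirements mentioned in the real estate example.

```python
from dataclasses import dataclass


@dataclass
class Property:
    name: str
    locality: str
    price: int
    rooms: int
    storeys: int


# Hypothetical listings, invented for illustration.
listings = [
    Property("Lakeside Cottage", "Riverside", 250_000, 3, 1),
    Property("Town House", "Riverside", 320_000, 4, 2),
    Property("City Flat", "Downtown", 200_000, 2, 1),
    Property("Family Villa", "Riverside", 480_000, 5, 2),
]


def recommend(listings, locality, budget, min_rooms, storeys=None):
    """Return listings that meet every stated requirement, cheapest first."""
    matches = [
        p for p in listings
        if p.locality == locality
        and p.price <= budget
        and p.rooms >= min_rooms
        and (storeys is None or p.storeys == storeys)
    ]
    return sorted(matches, key=lambda p: p.price)


for match in recommend(listings, locality="Riverside", budget=350_000, min_rooms=3):
    print(match.name, match.price)
```

No history of past purchases is needed, which is the point of the approach; the trade-off is that the results contain only what the user asked for.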

Hybrid recommenders

As the name suggests, hybrid recommenders are robust systems that combine various types of recommender models, including the ones we've already explained. As we've seen in previous sections, each model has its own set of advantages and disadvantages. Hybrid systems try to offset the disadvantages of one model with the advantages of another.

Let's consider the Netflix example again. When you sign in for the first time, Netflix overcomes the cold start problem of collaborative filters by using a content-based recommender, and, as you gradually start watching and rating movies, it brings its collaborative filtering mechanism into play. This is far more successful, so most practical recommender systems are hybrid in nature.
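One simple way to express this gradual hand-over is a weighted blend whose weight grows with the amount of rating data available for the user. This is an illustrative assumption only; the function name, the `warmup` parameter, and the linear schedule are hypothetical and do not describe how Netflix actually combines its models.

```python
def hybrid_score(content_score, collaborative_score, n_user_ratings, warmup=20):
    """Blend two predicted scores according to how much the user has rated.

    With no ratings the content-based score is used as-is; once the user
    has rated `warmup` items or more, the collaborative score takes over.
    """
    weight = min(n_user_ratings / warmup, 1.0)
    return (1 - weight) * content_score + weight * collaborative_score


# A brand-new user, a user halfway through warm-up, and an established user.
for n in (0, 10, 20):
    print(n, hybrid_score(content_score=4.0, collaborative_score=2.0, n_user_ratings=n))
```

The two input scores would come from separate content-based and collaborative models; the blend itself knows nothing about how they were produced.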

In this book, we will build a recommender system of each type and will examine all of the advantages and shortcomings described in the previous sections.

Summary

In this chapter, we gained an overview of the world of recommender systems. We saw two approaches to solving the recommendation problem; namely, prediction and ranking. Finally, we examined the various types of recommender systems and discussed their advantages and disadvantages.

In the next chapter, we will learn to process data with pandas, the data analysis library of choice in Python. This, in turn, will aid us in building the various recommender systems we've introduced.

Key benefits

  • Build industry-standard recommender systems
  • Only familiarity with Python is required
  • No need to wade through complicated machine learning theory to use this book

Description

Recommendation systems are at the heart of almost every internet business today, from Facebook to Netflix to Amazon. Providing good recommendations, whether it's friends, movies, or groceries, goes a long way in defining user experience and enticing your customers to use your platform. This book shows you how to do just that. You will learn about the different kinds of recommenders used in the industry and see how to build them from scratch using Python. No need to wade through tons of machine learning theory; you'll get started with building and learning about recommenders as quickly as possible. In this book, you will build an IMDB Top 250 clone and a content-based engine that works on movie metadata. You'll use collaborative filters to make use of customer behavior data, and build a hybrid recommender that incorporates content-based and collaborative filtering techniques. With this book, all you need to get started with building recommendation systems is a familiarity with Python, and by the time you're finished, you will have a great grasp of how recommenders work and be in a strong position to apply the techniques that you will learn to your own problem domains.

What you will learn

  • Get to grips with the different kinds of recommender systems
  • Master data-wrangling techniques using the pandas library
  • Build an IMDB Top 250 clone
  • Build a content-based engine to recommend movies based on movie metadata
  • Employ data-mining techniques used in building recommenders
  • Build industry-standard collaborative filters using powerful algorithms
  • Build hybrid recommenders that incorporate content-based and collaborative filtering

Table of Contents

9 Chapters
Preface
Getting Started with Recommender Systems
Manipulating Data with the Pandas Library
Building an IMDB Top 250 Clone with Pandas
Building Content-Based Recommenders
Getting Started with Data Mining Techniques
Building Collaborative Filters
Hybrid Recommenders
Other Books You May Enjoy
