
You're reading from Mastering Predictive Analytics with R

Product type: Book
Published in: Jun 2015
Reading level: Expert
ISBN-13: 9781783982806
Edition: 1st Edition
Author: Rui Miguel Forte

Rui Miguel Forte is currently the chief data scientist at Workable. He was born and raised in Greece and studied in the UK. He is an experienced data scientist, having over 10 years of work experience in a diverse array of industries spanning mobile marketing, health informatics, education technology, and human resources technology. His projects have included predictive modeling of user behavior in mobile marketing promotions, speaker intent identification in an intelligent tutor, information extraction techniques for job applicant resumes and fraud detection for job scams. He currently teaches R, MongoDB, and other data science technologies to graduate students in the Business Analytics MSc program at the Athens University of Economics and Business. In addition, he has lectured in a number of seminars, specialization programs, and R schools for working data science professionals in Athens. His core programming knowledge is in R and Java, and he has extensive experience working with a variety of database technologies such as Oracle, PostgreSQL, MongoDB, and HBase. He holds a Master’s degree in Electrical and Electronic Engineering from Imperial College London and is currently researching machine learning applications in information extraction and natural language processing.

Chapter 11. Recommendation Systems

In our final chapter, we'll tackle one of the most common problems in the e-commerce world: making effective product recommendations to customers. Recommendation systems, also referred to as recommender systems, often rely on the notion of similarity between objects, in an approach known as collaborative filtering. Its basic premise is that customers can be considered similar to each other if they have purchased most of the same products; equally, items can be considered similar to each other if a large number of the same customers purchased them.

There are a number of different ways to quantify this notion of similarity, and we will present some of the commonly used alternatives. Whether we want to recommend movies, books, hotels, or restaurants, building a recommender system often involves dealing with very large data sets. Consequently, we'll introduce a few ideas and options for working with Big Data using...
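
As a minimal sketch of the similarity premise described above, the following base R snippet compares customers by the products they share, and items by the customers they share. The purchase data and names are hypothetical, and the Jaccard coefficient is only one of several possible similarity measures, chosen here as an assumption for illustration.

```r
# Hypothetical purchase data: rows are customers, columns are products,
# TRUE means the customer bought the product.
purchases <- matrix(
  c(TRUE,  TRUE,  TRUE,  FALSE,
    TRUE,  TRUE,  FALSE, FALSE,
    FALSE, FALSE, TRUE,  TRUE),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("alice", "bob", "carol"),
                  c("book", "lamp", "mug", "pen"))
)

# Jaccard similarity: size of the intersection divided by size of the union.
jaccard <- function(a, b) sum(a & b) / sum(a | b)

# Customer-to-customer similarity (compare rows).
sim_alice_bob   <- jaccard(purchases["alice", ], purchases["bob", ])
sim_alice_carol <- jaccard(purchases["alice", ], purchases["carol", ])

# Item-to-item similarity (compare columns).
sim_book_lamp <- jaccard(purchases[, "book"], purchases[, "lamp"])
```

In this toy data, alice and bob overlap on most of their purchases and so score higher than alice and carol, while book and lamp were bought by exactly the same customers.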

Rating matrix


A recommendation system usually involves having a set of users U = {u_1, u_2, …, u_m} that have varying preferences on a set of items I = {i_1, i_2, …, i_n}. The number of users |U| = m is different from the number of items |I| = n in the general case. In addition, users can often express their preference by rating items on some scale. As an example, we can think of the users as being restaurant patrons in a city, and the items being the restaurants that they visit. Under this setup, the preferences of the users could be expressed as ratings on a five-star scale. Of course, our generalization does not require that the items be physical items or that the users be actual people; this is simply an abstraction for the recommender system problem that is commonly used.

As an illustration, think of a dating website in which users rate other users; here, the items that are being rated are the profiles of the actual users themselves. Let's return to our example of a restaurant recommender system...
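
To make the idea of a rating matrix concrete, here is a small base R sketch for the restaurant setting, with one row per user and one column per item. The restaurant names and ratings are hypothetical, and using NA for "not rated" is an assumption of this sketch rather than something the text prescribes.

```r
# Hypothetical ratings on a five-star scale; NA marks "not rated".
ratings <- matrix(
  c( 5, NA,  3,  1,
     4,  2, NA, NA,
    NA,  5,  4,  2),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("user", 1:3),
                  c("bistro", "diner", "sushi_bar", "taqueria"))
)

m <- nrow(ratings)   # number of users, |U|
n <- ncol(ratings)   # number of items, |I|

# Fraction of user-item pairs that have no rating (sparsity).
sparsity <- mean(is.na(ratings))

# Number of ratings made by each user, and mean rating per item.
ratings_per_user <- rowSums(!is.na(ratings))
item_means <- colMeans(ratings, na.rm = TRUE)
```

Even in this tiny example a third of the cells are empty; real rating matrices are typically far sparser.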

Collaborative filtering


Having covered distances, we are ready to delve into the topic of collaborative filtering, which will help us define a strategy for making recommendations. Collaborative filtering describes an algorithm, or more precisely a family of algorithms, that aims to create recommendations for a test user given only information about the ratings of other users via the rating matrix, as well as any ratings that the test user has already made.

There are two very common variants of collaborative filtering: memory-based collaborative filtering and model-based collaborative filtering. With memory-based collaborative filtering, the entire history of all the ratings made by all the users is remembered and must be processed in order to make a recommendation. The prototypical memory-based collaborative filtering method is user-based collaborative filtering. Although this approach uses all the ratings available, the downside is that it can be computationally expensive as the entire database...
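
The user-based idea can be sketched in a few lines of base R. This is an illustrative sketch under stated assumptions, not the chapter's implementation: the data are hypothetical, cosine similarity over co-rated items is one possible choice of measure, and the helper names `cosine_sim` and `predict_rating` as well as the neighborhood size `k` are invented for this example.

```r
# Hypothetical rating matrix; NA marks "not rated".
ratings <- matrix(
  c( 5,  4, NA,  1,
     5,  5,  4,  1,
     1,  2,  1,  5,
     4,  4,  5, NA),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("test", "u1", "u2", "u3"), paste0("item", 1:4))
)

# Cosine similarity computed over the items both users have rated.
cosine_sim <- function(a, b) {
  both <- !is.na(a) & !is.na(b)
  sum(a[both] * b[both]) / sqrt(sum(a[both]^2) * sum(b[both]^2))
}

# Predict a rating for `item` as the similarity-weighted average of the
# ratings given by the k most similar users who rated that item.
predict_rating <- function(ratings, user, item, k = 2) {
  others <- setdiff(rownames(ratings), user)
  others <- others[!is.na(ratings[others, item])]
  sims <- sapply(others, function(o) cosine_sim(ratings[user, ], ratings[o, ]))
  top <- names(sort(sims, decreasing = TRUE))[seq_len(min(k, length(sims)))]
  sum(sims[top] * ratings[top, item]) / sum(sims[top])
}

pred <- predict_rating(ratings, "test", "item3")
```

Note that every prediction scans the stored ratings of all other users, which is exactly why the memory-based approach becomes expensive as the database grows.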

Singular value decomposition


In a real-world recommender system, the rating matrix will eventually become very large as more users are added to the system and the list of items being offered grows. As a result, we may want to apply a dimensionality reduction technique to this matrix. Ideally, we would like to retain as much information as possible from the original matrix while doing this. One such method that has applications across a wide range of disciplines uses singular value decomposition, commonly abbreviated as SVD.

SVD is a matrix factorization technique that has a number of useful applications, one of which is dimensionality reduction. It is related to the PCA method of dimensionality reduction that we saw in Chapter 1, Gearing Up for Predictive Modeling, and many people confuse the two. Strictly speaking, SVD is just a mathematical method of factorizing matrices; in fact, some implementations of PCA use SVD to compute the principal components.

Let's begin by looking at...
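
As a hedged illustration of SVD used for dimensionality reduction, the sketch below applies base R's `svd()` to a small, hypothetical, fully observed matrix (a real rating matrix has missing entries that would need handling first). It also checks the link to PCA mentioned above by comparing against `prcomp()` on the same data.

```r
# Hypothetical dense matrix with two rough groups of similar columns.
A <- matrix(
  c(5, 4, 1, 1,
    4, 5, 1, 2,
    1, 1, 5, 4,
    2, 1, 4, 5,
    5, 5, 2, 1),
  nrow = 5, byrow = TRUE
)

s <- svd(A)   # A = U D V^T, returned as s$u, s$d (singular values), s$v

# The full factorization reproduces A up to floating-point error.
A_full <- s$u %*% diag(s$d) %*% t(s$v)

# Keep only the k largest singular values for a rank-k approximation.
k <- 2
A_k <- s$u[, 1:k] %*% diag(s$d[1:k]) %*% t(s$v[, 1:k])

# Share of the squared singular values retained by the approximation.
energy_kept <- sum(s$d[1:k]^2) / sum(s$d^2)

# Link to PCA: on the column-centred matrix, the right singular vectors
# match the principal component loadings up to sign.
Ac <- scale(A, center = TRUE, scale = FALSE)
pca <- prcomp(A)   # centres columns by default
loadings_match <- all.equal(abs(svd(Ac)$v), abs(unname(pca$rotation)))
```

The value of `energy_kept` gives a rough sense of how much of the original matrix survives when only `k` components are retained.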

R and Big Data


Before we dive deep into building a few recommender systems using real-world data sets, we'll take a short detour and spend some time thinking about Big Data. Many real-world recommender systems arise out of the analysis of massive data sets. Examples include the product recommendation engine of amazon.com and the movie recommendation engine of Netflix.

Most, if not all, of the data sets that we have looked at in this book have been relatively small, chosen intentionally so that the reader can follow along with the examples without needing access to powerful computing resources. These days, the field of predictive analytics, as well as the related fields of machine learning, data science, and data analysis in general, is heavily concerned with handling Big Data.

Note

The term Big Data has become a buzzword that has entered everyday conversation and as an inevitable result, we often encounter uses that...

Predicting recommendations for movies and jokes


In this chapter, we will focus on building recommender systems using two different data sets. To do this, we shall use the recommenderlab package. This provides us with not only the algorithms to perform the recommendations, but also with the data structures to store the sparse rating matrices efficiently. The first data set we will use contains anonymous user reviews for jokes from the Jester Online Joke recommender system.

The joke ratings fall on a continuous scale (-10 to +10). A number of data sets collected from the Jester system can be found at http://eigentaste.berkeley.edu/dataset/. We will use the data set labeled on the website as Dataset 2+. This data set contains ratings made by 50,692 users on 150 jokes. As is typical with a real-world application, the rating matrix is very sparse in that each user rated only a fraction of all the jokes; the minimum number of ratings made by a user is 8. We will refer to this data set as the jester...

Loading and preprocessing the data


Our first goal in building our recommender systems is to load the data into R, preprocess it, and convert it into a rating matrix. More precisely, in each case, we will be creating a realRatingMatrix object, which is the specific data structure that the recommenderlab package uses to store numerical ratings. We will start with the jester data set. If we download and unzip the archive from the website, we'll see that the file jesterfinal151cols.csv contains the ratings. More specifically, each row in this file corresponds to the ratings made by a particular user, and each column corresponds to a particular joke.

The columns are comma-separated and there is no header row. In fact, the format would already be a rating matrix were it not for the first column, which is special and contains the total number of ratings made by a particular user. We will load these data into a data table using the function fread(), which is a fast implementation...
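
The file layout described above can be mimicked with a tiny, hypothetical excerpt to show the preprocessing steps. This sketch makes several assumptions: it swaps in base R's `read.csv()` for `fread()` so that it runs without extra packages, it writes its own three-row file instead of reading jesterfinal151cols.csv, and it assumes that the value 99 marks an unrated joke.

```r
# Hypothetical three-user excerpt: no header, comma-separated,
# first column = number of jokes the user rated.
# ASSUMPTION: 99 marks "not rated".
csv_path <- tempfile(fileext = ".csv")
writeLines(c("2,99,3.5,-7.25,99",
             "3,8.0,99,0.5,-2.0",
             "1,99,99,99,9.75"), csv_path)

# read.csv() stands in for data.table::fread() to avoid a dependency.
raw <- read.csv(csv_path, header = FALSE)

rating_counts <- raw[[1]]                 # the special first column
rating_matrix <- as.matrix(raw[, -1])     # one row per user, one column per joke
rating_matrix[rating_matrix == 99] <- NA  # recode the sentinel as missing
dimnames(rating_matrix) <- list(paste0("u", seq_len(nrow(rating_matrix))),
                                paste0("j", seq_len(ncol(rating_matrix))))

# Sanity check: the first column should match the ratings we kept.
counts_agree <- all(rowSums(!is.na(rating_matrix)) == rating_counts)

# With recommenderlab attached, as(rating_matrix, "realRatingMatrix")
# would convert this into the package's sparse rating structure.
```

Keeping the first column around just long enough to validate the row counts is a cheap way to catch parsing mistakes before converting the matrix.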

Exploring the data


Before building and evaluating recommender systems using the two data sets we have loaded, it is a good idea to get a feel for the data. For one thing, we can make use of the getRatings() function to retrieve the ratings from a rating matrix. This is useful in order to construct a histogram of item ratings. We can also normalize the ratings with respect to each user, as we discussed earlier. The following code snippet shows how we can retrieve the ratings and the normalized ratings for the jester data. We can then do the same for the MovieLens data and produce histograms for the ratings:

> library(recommenderlab)  # provides getRatings() and normalize()
> # Raw ratings as a numeric vector (jester_rrm was built in the previous section)
> jester_ratings <- getRatings(jester_rrm)
> # Ratings standardized with respect to each user (Z-score)
> jester_normalized_ratings <- getRatings(normalize(jester_rrm,
                                          method = "Z-score"))

The following plot shows the different histograms:

In the jester data, we can see that ratings above zero are more prominent than ratings below zero, and the most common rating is 10, the maximum rating....
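
To show what normalizing ratings with respect to each user amounts to, here is a base R sketch of a row-wise Z-score on a small hypothetical matrix. It is an illustration of the idea under the assumption that each user's ratings are centred on that user's mean and divided by that user's standard deviation; it is not the recommenderlab implementation, and the helper name `z_score_rows` is invented for this example.

```r
# Hypothetical ratings on the jester scale (-10 to +10); NA = not rated.
ratings <- matrix(
  c( 8,  6, 10, NA,
    -4, -6, NA, -2,
     0,  5, -5, NA),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("u", 1:3), paste0("j", 1:4))
)

# Row-wise Z-score: subtract each user's mean rating and divide by that
# user's standard deviation, ignoring unrated items.
z_score_rows <- function(m) {
  mu <- rowMeans(m, na.rm = TRUE)
  sigma <- apply(m, 1, sd, na.rm = TRUE)
  (m - mu) / sigma
}

normalized <- z_score_rows(ratings)

# Bin the observed ratings without drawing, as a histogram would.
raw_hist <- hist(ratings[!is.na(ratings)], breaks = seq(-10, 10, by = 5),
                 plot = FALSE)
```

After normalization a generous rater and a harsh rater become directly comparable, because each user's ratings are expressed relative to their own typical score.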

Other approaches to recommendation systems


In this chapter, we concentrated our efforts on building recommendation systems by following the collaborative filtering paradigm. This is a very popular approach for its many advantages. By essentially mimicking word-of-mouth recommendations, it requires virtually no knowledge about the items being recommended nor any background about the users in question.

Moreover, collaborative filtering systems incorporate new ratings as they arise, either through a memory approach, or via regular retraining of a model-based approach. Thus, they naturally become better for their users over time as they learn more information and adapt to changing preferences. On the other hand, they are not without their disadvantages, not least of which is the fact that they will not take into account any information about the items and their content even when it is available.

Content-based recommendation systems try to suggest items to users that are similar to those that users...

Summary


In this chapter, we explored the process of building and evaluating recommender systems in R using the recommenderlab package. We focused primarily on the paradigm of collaborative filtering, which in a nutshell formalizes the idea of recommending items to users through word of mouth. As a general rule, we found that user-based collaborative filtering is quick to set up, since it has no model to train, but requires all the data to make predictions. Item-based collaborative filtering can be slow to train a model but makes predictions very quickly once the model is trained. It is useful in practice because it does not require us to store all the data. In some scenarios, the difference in accuracy between these two can be large, but in others it is acceptable.

The process of training recommendation systems is quite resource intensive and a number of important parameters come into play in the design, such as the metrics used to quantify similarity and distance between items and users. As the data sets we...
