
You're reading from Big Data Analytics with Java

Product type: Book
Published in: Jul 2017
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781787288980
Edition: 1st Edition
Author (1)
RAJAT MEHTA

The author is a VP (Technical Architect) in technology at JP Morgan Chase in New York. He is a Sun Certified Java Developer and has worked on Java-related technologies for more than 16 years. For the past few years, his role has heavily involved using the big data stack and running analytics on it. He is also a contributor to various open source projects that are available on his GitHub repository and a frequent writer for dev magazines.

Chapter 9. Recommendation Systems

When you go to a bookstore, you generally have a particular book in mind and look for it on the shelves. The store usually keeps its current best sellers up front and the rest of its inventory sorted on the shelves. A typical small bookstore might stock a few thousand books or more. In short, everything that is physically available is right in front of you as a customer, and you can pick whatever you like at that moment. Physical stores also keep their top products up front because they sell better, but there is no way to arrange the products according to the preferences of each customer who walks in. This is not the case when you visit a popular online e-commerce store such as Amazon or Walmart. There could be a million if not a billion products on Amazon when you go to buy...

Recommendation systems and their types


Before we dig deeper into the concepts behind recommendation systems, let's look at two real-world examples of recommendation engines that we might use on a daily basis. The examples are shown in the following screenshots. The first screenshot is from Amazon.com, where we can see a section called Customers who bought this also bought, and the second is from YouTube.com, where we see a section called Recommended:

The screenshot taken from http://www.amazon.com shows a list of books on Java. If you search Amazon.com for the keyword Core Java, you will get a list of books on Core Java. If you select one of those Core Java books and click on it, you will be taken to a page with the full description of the book: its price, author, reviews, and so on. It is here, at the bottom of this section, that you will find a link, as shown above, where it is mentioned Customers...

Content-based recommendation systems


In content-based recommendations, the recommendation system checks for similarity between items based on their attributes or content and then proposes similar items to the end users. For example, if the system has to show movies similar to a given movie, it might compare attributes such as the director's name, the actors in the movie, the genre of the movie, and so on. Likewise, if a news website has to show similar news, the system might check for the presence of certain words within the news articles to build the similarity criteria. As such, the recommendations are based on actual content, whether in the form of tags, metadata, or content from the item itself (as in the case of news articles).
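The attribute comparison described above can be sketched in plain Java. This is only a minimal illustration under stated assumptions: the Movie class, the attribute tags such as director:A, and the choice of Jaccard similarity (shared attributes divided by all distinct attributes) are hypothetical choices for this sketch and are not necessarily what the chapter uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch: the Movie class, its attribute tags, and the sample
// titles are hypothetical and not taken from the chapter.
final class ContentBasedSketch {

    // A movie described only by its content attributes (director, genre, ...).
    static final class Movie {
        final String title;
        final Set<String> attributes;

        Movie(String title, String... attributes) {
            this.title = title;
            this.attributes = new HashSet<>(Arrays.asList(attributes));
        }
    }

    // Jaccard similarity: size of the intersection divided by size of the union.
    static double jaccard(Set<String> a, Set<String> b) {
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        if (union.isEmpty()) {
            return 0.0; // two empty attribute sets give us nothing to compare
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        return (double) intersection.size() / union.size();
    }

    // Returns every other movie in the catalog, most similar to the target first.
    static List<Movie> mostSimilar(Movie target, List<Movie> catalog) {
        List<Movie> others = new ArrayList<>();
        for (Movie m : catalog) {
            if (!m.title.equals(target.title)) {
                others.add(m);
            }
        }
        others.sort(Comparator.comparingDouble(
                (Movie m) -> jaccard(target.attributes, m.attributes)).reversed());
        return others;
    }

    public static void main(String[] args) {
        Movie target = new Movie("Movie 1", "director:A", "genre:action");
        List<Movie> catalog = Arrays.asList(
                target,
                new Movie("Movie 2", "director:B", "genre:action"),
                new Movie("Movie 3", "director:A", "genre:action"),
                new Movie("Movie 4", "director:C", "genre:drama"));
        for (Movie m : mostSimilar(target, catalog)) {
            System.out.println(m.title + " -> " + jaccard(target.attributes, m.attributes));
        }
    }
}
```

Note that no user history is needed here: the ranking depends only on the attributes attached to each item. For text content such as news articles, the attribute set could instead hold the distinctive words of each article.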

Let's try to understand content-based recommendation using the following diagram:

As you can see in the preceding diagram, there are four movies each with a specific director and genre...

Summary


In this chapter, we learned about recommendation engines. We saw the two types of recommendation engines: content recommenders and collaborative filtering recommenders. We learned how content recommenders can be built with little to no historical data, because they rely on the attributes of the items themselves to measure similarity with other items and recommend them. Later, we worked on a collaborative filtering example using the same MovieLens dataset and the Apache Spark alternating least squares recommender. We learned that collaborative filtering is based on historical data about users' activity: similar users are identified from that data, and the products they liked are recommended to the other users.
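As a reminder of the collaborative filtering idea, here is a minimal plain-Java sketch of a user-based, nearest-neighbor recommender. It is deliberately not the Apache Spark alternating least squares recommender used in the chapter; it is a simpler technique swapped in to show how "find a similar user, then recommend what they liked" might look. The class name, method names, sample users, and the liked threshold are all assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch: a nearest-neighbor recommender, NOT Spark ALS.
// All names and sample ratings here are hypothetical.
final class UserBasedCfSketch {

    // Euclidean length of a sparse rating vector.
    static double norm(Map<String, Double> ratings) {
        double sum = 0.0;
        for (double r : ratings.values()) {
            sum += r * r;
        }
        return Math.sqrt(sum);
    }

    // Cosine similarity between two users; unrated items count as zero.
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;
            }
        }
        double norms = norm(a) * norm(b);
        return norms == 0.0 ? 0.0 : dot / norms;
    }

    // Recommends items the most similar user rated at or above the threshold
    // and that the given user has not rated yet, sorted by item name.
    static List<String> recommend(String user,
                                  Map<String, Map<String, Double>> ratings,
                                  double likedThreshold) {
        Map<String, Double> own =
                ratings.getOrDefault(user, Collections.<String, Double>emptyMap());
        String nearest = null;
        double best = 0.0;
        for (Map.Entry<String, Map<String, Double>> e : ratings.entrySet()) {
            if (e.getKey().equals(user)) {
                continue;
            }
            double sim = cosine(own, e.getValue());
            if (sim > best) {
                best = sim;
                nearest = e.getKey();
            }
        }
        List<String> result = new ArrayList<>();
        if (nearest == null) {
            return result; // no history in common with anyone
        }
        for (Map.Entry<String, Double> e : ratings.get(nearest).entrySet()) {
            if (e.getValue() >= likedThreshold && !own.containsKey(e.getKey())) {
                result.add(e.getKey());
            }
        }
        Collections.sort(result);
        return result;
    }

    // Small helper to build one user's ratings from item/rating pairs.
    static Map<String, Double> ratingsOf(Object... pairs) {
        Map<String, Double> m = new LinkedHashMap<>();
        for (int i = 0; i < pairs.length; i += 2) {
            m.put((String) pairs[i], ((Number) pairs[i + 1]).doubleValue());
        }
        return m;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Double>> ratings = new LinkedHashMap<>();
        ratings.put("alice", ratingsOf("Movie A", 5, "Movie B", 4));
        ratings.put("bob", ratingsOf("Movie A", 5, "Movie B", 5, "Movie C", 5, "Movie D", 1));
        ratings.put("carol", ratingsOf("Movie A", 1, "Movie D", 5, "Movie E", 5));
        System.out.println(recommend("alice", ratings, 4.0));
    }
}
```

Unlike the content-based approach, this sketch never looks at item attributes. It depends entirely on the rating history, which is why it cannot recommend anything for a user who has no ratings in common with anyone else.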

In the next chapter, we will learn two important algorithms from the world of unsupervised learning that will help us form clusters or groups in unlabeled data. We will also see how these algorithms help us segment the important customers...
