In this article by Rik Van Bruggen, the author of Learning Neo4j, we will take a look at so-called recommender systems. Such systems, broadly speaking, consist of two elementary parts:
(For more resources related to this topic, see here.)
A pattern discovery system: This is the system that somehow figures out what would be a useful recommendation for a particular target group. This discovery can be done in many different ways, but in general we see three ways to do so:
A business expert who thoroughly understands the domain of the graph database application will use this understanding to determine useful recommendations. For example, the supervisor of a "do-it-yourself" retail outlet would understand a particular pattern. Suppose that if someone came in to buy multiple pots of paint, they would probably also benefit from getting a gentle recommendation for a promotion of high-end brushes. The store has that promotion going on right then, so the recommendation would be very timely. This process would be a discovery process where the pattern that the business expert has discovered would be applied and used in graph databases in real time as part of a sophisticated recommendation.
A visual discovery of a specific pattern in the graph representation of the business domain. We have found that in many different projects, business users have stumbled upon these kinds of patterns while using graph visualizations to look at their data. Specific patterns emerge, unexpected relations jump out, or, in more advanced visualization solutions, specific clusters of activity all of a sudden become visible and require further investigation. Graph databases such as Neo4j, and the visualization solutions that complement it, can play a wonderfully powerful role in this process.
An algorithmic discovery of a pattern in a dataset of the business domain using machine learning algorithms to uncover previously unknown patterns in the data. Typically, these processes require an iterative, raw number-crunching approach that has also been used on non-graph data formats in the past. It remains part-art part-science at this point, but can of course yield interesting insights if applied correctly.
An overview of recommender systems
A pattern application system: All recommender systems will be more or less successful based on not just the patterns that they are able to discover, but also based on the way that they are able to apply these patterns in business applications. We can look at different types of applications for these patterns:
Batch-oriented applications: Some applications of these patterns are not as time-critical as one would expect. It does not really matter in any material kind of way if a bulk e-mail with recommendations, or worse, a printed voucher with recommended discounted products, gets delivered to the customer or prospect at 10 am or 11 am. Batch solutions can usually cope with these kinds of requests, even if they do so in the most inefficient way.
Real-time oriented applications: Some pattern applications simply have to be delivered in real time, in between a web request and a web response, and cannot be precalculated. For these types of systems, which typically use more complex database queries in order to match for the appropriate recommendation to make, graph databases such as Neo4j are a fantastic tool to have. We will illustrate this going forward.
With this classification behind us, we will look at an example dataset and some example queries to make this topic come alive.
Using a graph model for recommendations
We will be using a very specific data model for our recommender system. All we have changed is that we added a couple of products and brands to the model, and inserted some data into the database correspondingly. In total, we added the following:
Three product brands
Fifty relationships between existing person nodes and the mentioned products, highlighting that these persons bought these products
These are the products and brands that we added:
Adding products and brands to the dataset
The following diagram shows the resulting model:
In Neo4j, that model will look something like the following:
A dataset like this one, while of course a broad simplification, offers us some interesting possibilities for a recommender system. Let's take a look at some queries that could really match this use case, and that would allow us to either visually or in real time exploit the data in this dataset in a product recommendation application.
This article elaborately dissected the so-called recommender systems.
Resources for Article:
- Working with a Neo4j Embedded Database [article]
- Comparative Study of NoSQL Products [article]
- Getting Started with CouchDB and Futon [article]