Search icon
Arrow left icon
All Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletters
Free Learning
Arrow right icon
Hands-On Recommendation Systems with Python

You're reading from  Hands-On Recommendation Systems with Python

Product type Book
Published in Jul 2018
Publisher Packt
ISBN-13 9781788993753
Pages 146 pages
Edition 1st Edition
Languages
Author (1):
Rounak Banik Rounak Banik
Profile icon Rounak Banik

Document vectors

Essentially, the models we are building compute the pairwise similarity between bodies of text. But how do we numerically quantify the similarity between two bodies of text?

To put it another way, consider three movies: A, B, and C. How can we mathematically prove that the plot of A is more similar to the plot of B than to that of C (or vice versa)?

The first step toward answering these questions is to represent the bodies of text (henceforth referred to as documents) as mathematical quantities. This is done by representing these documents as vectors. In other words, every document is depicted as a series of n numbers, where each number represents a dimension and n is the size of the vocabulary of all the documents put together.

But what are the values of these vectors? The answer to that question depends on the vectorizer we are using to convert our documents...

lock icon The rest of the chapter is locked
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $15.99/month. Cancel anytime}