
You're reading from Natural Language Processing with Python Quick Start Guide

Product type: Book
Published in: Nov 2018
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781789130386
Edition: 1st Edition
Author (1)
Nirant Kasliwal

Nirant Kasliwal maintains an awesome list of natural language processing (NLP) resources, which GitHub's machine learning collection features as the go-to guide. Nobel Laureate Dr. Paul Romer found his programming notes on Jupyter Notebooks helpful. Nirant won the first-ever NLP Google Kaggle Kernel Award. At Soroco, he works on image segmentation and intent categorization. His state-of-the-art language modeling results are available as Hindi2vec.

Word representations

The best-known word embedding methods are word2vec from Google (Mikolov et al.) and GloVe from Stanford (Pennington, Socher, and Manning). fastText is also popular, particularly for multilingual work, because it builds embeddings from sub-word units.
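To give a feel for what "sub-word" means here, the following is a minimal sketch of splitting a word into character n-grams, the kind of unit fastText builds its vectors from. The function name `char_ngrams` and its parameters are our own illustrative choices, not part of the fastText library's API, and the real library additionally hashes the n-grams and keeps the whole word as its own unit.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return the character n-grams of `word`, in the style used by fastText.

    The word is wrapped in '<' and '>' boundary markers so that prefixes and
    suffixes can be told apart from n-grams in the middle of a word.
    NOTE: illustrative helper, not the fastText library's own function.
    """
    wrapped = f"<{word}>"
    ngrams = []
    for n in range(min_n, max_n + 1):
        for start in range(len(wrapped) - n + 1):
            ngrams.append(wrapped[start:start + n])
    return ngrams


# Trigrams only, to keep the output short.
print(char_ngrams("where", min_n=3, max_n=3))
# ['<wh', 'whe', 'her', 'ere', 're>']
```

Because a word's vector can be composed from the vectors of its n-grams, rare or unseen words that share n-grams with known words can still receive a reasonable representation.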

We advise against using word2vec or GloVe. Instead, use fastText vectors, which are much better and come from some of the same authors as word2vec. word2vec was introduced by T. Mikolov et al. (https://scholar.google.com/citations?user=oBu8kMMAAAAJ&hl=en) while he was at Google, and it performs well on word similarity and analogy tasks.
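To make "word similarity and analogy tasks" concrete, here is a small sketch using hand-made three-dimensional toy vectors. The vectors, the `cosine` and `analogy` helpers, and the tiny vocabulary are all assumptions made up for illustration; real embeddings have hundreds of dimensions and are learned from data.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(a, b, c, vectors):
    """Answer 'a is to b as c is to ?' via the vector b - a + c.

    The three query words are excluded from the candidates, as is usual
    in analogy evaluations.
    """
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy vectors, NOT real embeddings. Dimensions are loosely
# (royalty, male, female) so the arithmetic is easy to follow.
toy = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.2, 0.2],
}

# Similarity task: related words should score higher than unrelated ones.
print(cosine(toy["king"], toy["queen"]) > cosine(toy["king"], toy["apple"]))

# Analogy task: man is to king as woman is to ?
print(analogy("man", "king", "woman", toy))
```

With pretrained vectors, the same two operations are what similarity and analogy benchmarks measure, only over a much larger vocabulary.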

GloVe was introduced by Pennington, Socher, and Manning from Stanford in 2014 as a statistical approximation for word embedding. The word vectors are created by factorizing word-word co-occurrence matrices.
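The idea of factorizing a co-occurrence matrix can be sketched in pure Python as follows. This is a toy illustration under stated assumptions, not the reference GloVe implementation: the corpus, window size, dimensions, learning rate, and `x_max=10` are made up for the example, co-occurrences are not distance-weighted, and plain SGD stands in for the optimizer used in the original work. The objective it minimizes is the weighted least-squares fit of `w_i . c_j + b_i + b_j` to `log X_ij`.

```python
import math
import random
from collections import Counter


def cooccurrence(sentences, window=2):
    """Count symmetric word-word co-occurrences within `window` tokens."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            for j in range(max(0, i - window), i):
                counts[(word, tokens[j])] += 1
                counts[(tokens[j], word)] += 1
    return counts


def weight(x, x_max=10.0, alpha=0.75):
    """GloVe-style weighting that damps rare pairs (x_max is a toy value)."""
    return (x / x_max) ** alpha if x < x_max else 1.0


def predict(a, b, W, C, bw, bc):
    """Model's estimate of log X_ab: dot product plus two bias terms."""
    return sum(p * q for p, q in zip(W[a], C[b])) + bw[a] + bc[b]


def total_loss(counts, W, C, bw, bc):
    """Weighted squared error between predictions and log counts."""
    return sum(weight(x) * (predict(a, b, W, C, bw, bc) - math.log(x)) ** 2
               for (a, b), x in counts.items())


def train(counts, dim=5, epochs=50, lr=0.05, seed=0):
    """Fit word (W) and context (C) vectors with plain SGD."""
    rng = random.Random(seed)
    vocab = sorted({w for pair in counts for w in pair})
    W = {w: [rng.uniform(-0.5, 0.5) for _ in range(dim)] for w in vocab}
    C = {w: [rng.uniform(-0.5, 0.5) for _ in range(dim)] for w in vocab}
    bw = {w: 0.0 for w in vocab}
    bc = {w: 0.0 for w in vocab}
    losses = [total_loss(counts, W, C, bw, bc)]  # loss before training
    pairs = sorted(counts.items())
    for _ in range(epochs):
        rng.shuffle(pairs)
        for (a, b), x in pairs:
            error = predict(a, b, W, C, bw, bc) - math.log(x)
            grad = 2.0 * weight(x) * error
            for k in range(dim):
                wa, cb = W[a][k], C[b][k]
                W[a][k] -= lr * grad * cb
                C[b][k] -= lr * grad * wa
            bw[a] -= lr * grad
            bc[b] -= lr * grad
        losses.append(total_loss(counts, W, C, bw, bc))
    return W, C, losses


# Toy corpus, made up for illustration.
corpus = ["the cat sat on the mat", "the dog sat on the rug"]
counts = cooccurrence(corpus)
W, C, losses = train(counts)
print(f"loss before: {losses[0]:.4f}, after: {losses[-1]:.4f}")
```

The rows of `W` play the role of word vectors. On a two-sentence corpus they carry no real meaning; the point is only to show that the vectors are fitted so that their dot products reproduce the logged co-occurrence counts.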

If you must pick the lesser of two evils, we recommend using GloVe over word2vec. This is because GloVe outperforms...
