Creating word clouds
You may have seen word clouds produced by Wordle or similar tools. If not, you will see them soon enough in this chapter. A couple of Python libraries can create word clouds; however, they don't yet seem to match the quality of Wordle's output. We can create a word cloud on the Wordle web page at http://www.wordle.net/advanced. Wordle requires a list of words and weights in the following format:
Word1 : weight
Word2 : weight
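As a minimal sketch of producing text in this format, the snippet below counts the words in a small made-up token list and prints each word with its count as the weight. The helper name to_wordle_lines, the toy tokens, and the choice of one pair per line sorted by descending weight are illustrative assumptions, not part of cloud.py.

```python
from collections import Counter


def to_wordle_lines(weights):
    """Format a {word: weight} mapping as 'word : weight' strings.

    Hypothetical helper for illustration: heaviest words come first,
    and ties are broken alphabetically so the output is deterministic.
    """
    ordered = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return ["%s : %d" % (word, weight) for word, weight in ordered]


# Toy input standing in for real review words (assumption, not corpus data).
tokens = "good film good plot bad film good".split()
counts = Counter(tokens)

lines = to_wordle_lines(counts)
print("\n".join(lines))
```

The printed lines can then be pasted into the text box on the Wordle page. Using raw counts as weights is the simplest choice; any other positive number per word would fit the same format.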
Modify the code from the previous example to print the word list. As a metric, we will use word frequency and keep only the words in the top percent. This requires nothing new, and the final code is in the cloud.py file in this book's code bundle:
from nltk.corpus import movie_reviews
from nltk.corpus import stopwords
from nltk import FreqDist
import string
sw = set(stopwords.words('english'))
punctuation = set(string.punctuation)
def isStopWord(word):
    return word in sw or word in punctuation
review_words = movie_reviews...