LDA topic modeling with gensim
In the previous recipe, we created an LDA model with the sklearn package. In this recipe, we will create an LDA model using the gensim package.
Getting ready
We will be using the gensim package, which can be installed using the following command:
pip install gensim
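To confirm that the installation succeeded before continuing, a quick check along the following lines can help. This is a minimal sketch using only the standard library; the helper name installed_version is our own and is not part of gensim:

```python
import importlib.util
from importlib import metadata


def installed_version(package):
    """Return the installed version of `package`, or None if it cannot be imported.

    Hypothetical helper for illustration; it is not part of gensim.
    """
    # find_spec returns None when no top-level module with this name is found.
    if importlib.util.find_spec(package) is None:
        return None
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # Importable, but no distribution metadata (for example, a stdlib module).
        return "unknown"


if __name__ == "__main__":
    # Prints a version string if gensim is installed, otherwise None.
    print(installed_version("gensim"))
```

If the check prints None, rerun the pip command above in the same environment that you use to run the recipe.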
How to do it…
We will load the data, clean it, preprocess it in a similar fashion to the previous recipe, and then create the LDA model. The steps for this recipe are as follows:
- Perform the necessary imports:
import re
import pandas as pd
from gensim.models.ldamodel import LdaModel
import gensim.corpora as corpora
from gensim.utils import simple_preprocess
import matplotlib.pyplot as plt
from pprint import pprint
from Chapter06.lda_topic import stopwords, bbc_dataset, clean_data
- Define the function that will preprocess the data. It uses the clean_data function from the previous recipe:
def preprocess(df):
    df = clean_data(df...