You're reading from Natural Language Processing with Python Quick Start Guide

Product type: Book
Published in: Nov 2018
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781789130386
Edition: 1st Edition
Author (1)
Nirant Kasliwal

Nirant Kasliwal maintains an awesome list of natural language processing (NLP) resources. GitHub's machine learning collection features this as the go-to guide. Nobel Laureate Dr. Paul Romer found his programming notes on Jupyter Notebooks helpful. Nirant won the first ever NLP Google Kaggle Kernel Award. At Soroco, he works on image segmentation and intent categorization. His state-of-the-art language modeling results are available as Hindi2vec.

Building your Own Chatbot

Chatbots, better referred to as conversation software, are amazing tools for a lot of businesses. They help businesses serve their clients 24/7 without increasing effort, with consistent quality, and with the built-in option to defer to a human when bots are not enough.

They are a great example of where technology and AI have come together to improve the impact of human effort.

They range from voice-based solutions such as Alexa, to text-based Intercom chat boxes, to menu-based navigation in Uber.

A common misconception is that building chatbots needs large teams and a lot of machine learning expertise. That is true only if you are trying to build a generic chatbot platform like those from Microsoft or Facebook (or even LUIS, Wit.ai, and so on).

In this chapter, we will cover the following topics:

  • Why build a chatbot?
  • Figuring out the right user intent...

Why chatbots as a learning example?

So far, we have built an application for every NLP topic that we have seen:

  • Text cleaning using grammar and vocabulary insights
  • Linguistics (and statistical parsers), to mine questions from text
  • Entity recognition for information extraction
  • Supervised text classification using both machine learning and deep learning
  • Text similarity using text-based vectors such as GloVe/word2vec

We will now combine all of them into a much more complicated setup and write our own chatbot from scratch. But, before you build anything from scratch, you should ask why.

Why build a chatbot?

A related question is: why should we build our own chatbot? Why can't we use FB/MSFT/some other cloud service?

Perhaps...

Quick code means word vectors and heuristics

For the sake of simplicity, we will assume that our bot does not need to remember the context of any question. It therefore sees an input, responds to it, and is done. No links are established with previous inputs.
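
As a minimal sketch of what "no context" means in code, the snippet below uses a hypothetical `respond` function and made-up keyword rules (neither comes from the book). Each call depends only on the current message, so the same input should always produce the same reply, whatever was said before.

```python
# Hypothetical keyword rules, for illustration only (not from the book).
RULES = {
    "hello": "Hi! What would you like to eat?",
    "bye": "Goodbye!",
}
FALLBACK = "Sorry, I did not understand that."


def respond(message: str) -> str:
    """Reply using only the current message; no history is stored."""
    for token in message.lower().split():
        word = token.strip(".,!?")  # drop trailing punctuation
        if word in RULES:
            return RULES[word]
    return FALLBACK


# The reply never depends on earlier turns.
first = respond("Hello there")
respond("Bye!")
second = respond("Hello there")
```

Because nothing is saved between calls, `first` and `second` are identical. A bot that tracks context would need some per-user state, which this chapter's simplifying assumption avoids.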

Let's start by simply loading the word vectors using gensim:

import numpy as np
import gensim
print(f"Gensim version: {gensim.__version__}")

from tqdm import tqdm
class TqdmUpTo(tqdm):
    def update_to(self, b=1, bsize=1, tsize=None):
        if tsize is not None: self.total = tsize
        self.update(b * bsize - self.n)

def get_data(url, filename):
    """
    Download data if the filename does not exist already
    Uses Tqdm to show download progress
    """
    import os
    from urllib.request import urlretrieve

    if not os.path.exists(filename):

        dirname = os.path.dirname(filename...
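
To give a feel for how word vectors and a simple heuristic can pick an intent without any training data, here is a small sketch. It is not the book's implementation: the 3-dimensional vectors, the intent names, and the functions `centroid`, `cosine`, and `guess_intent` are all invented for illustration. With real embeddings, the toy dictionary would be replaced by vectors loaded through gensim.

```python
import math

# Toy 3-dimensional "word vectors", invented for this sketch.
# Real vectors (e.g. GloVe/word2vec) have hundreds of dimensions.
TOY_VECTORS = {
    "hello": [0.9, 0.1, 0.0],
    "hi": [0.8, 0.2, 0.0],
    "hey": [0.85, 0.15, 0.05],
    "food": [0.0, 0.9, 0.1],
    "eat": [0.1, 0.8, 0.2],
    "restaurant": [0.05, 0.85, 0.3],
    "hungry": [0.1, 0.9, 0.0],
}

# Hypothetical intents, each described by a few example words.
INTENT_EXAMPLES = {
    "greet": ["hello", "hi"],
    "find_food": ["food", "eat", "restaurant"],
}


def centroid(words):
    """Average the vectors of the known words; None if none are known."""
    vecs = [TOY_VECTORS[w] for w in words if w in TOY_VECTORS]
    if not vecs:
        return None
    return [sum(dim) / len(vecs) for dim in zip(*vecs)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def guess_intent(message):
    """Return the intent whose centroid is closest to the message, or None."""
    query = centroid(message.lower().split())
    if query is None:
        return None  # no known words, so no guess
    scores = {
        intent: cosine(query, centroid(examples))
        for intent, examples in INTENT_EXAMPLES.items()
    }
    return max(scores, key=scores.get)
```

Note that "hey" and "hungry" appear in no intent's example list, yet `guess_intent("hey")` and `guess_intent("i am hungry")` can still be matched because their vectors lie close to the right centroid. That is the appeal of the heuristic: a handful of example words per intent stands in for labeled training data.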

Summary

In this chapter on chatbots, we learned about intent, which usually refers to the user input; the response, which comes from the bot; templates, which define the nature of bot responses; and entities, such as the cuisine type in our example.
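
The relationship between these terms can be sketched in a few lines. Everything here is an assumption made for illustration: the template strings, the cuisine list, and the helper names `find_cuisine` and `build_response` are not taken from the chapter.

```python
import random

# Hypothetical response templates keyed by intent; {cuisine} is a slot
# filled from an entity found in the user input.
TEMPLATES = {
    "greet": ["Hello!", "Hi there!"],
    "find_food": [
        "Looking for {cuisine} places near you.",
        "Here are some {cuisine} restaurants.",
    ],
}

# A made-up list of values for the cuisine entity.
CUISINES = {"italian", "mexican", "thai"}


def find_cuisine(message):
    """Return the first known cuisine word in the message, or None."""
    for token in message.lower().split():
        word = token.strip(".,!?")
        if word in CUISINES:
            return word
    return None


def build_response(intent, message, rng=random):
    """Pick a template for the intent and fill its entity slot."""
    template = rng.choice(TEMPLATES[intent])
    cuisine = find_cuisine(message) or "any"  # fallback when no entity found
    return template.format(cuisine=cuisine)


reply = build_response("find_food", "Any good Thai food?")
```

Here the intent selects which group of templates to use, the entity fills the slot, and the filled template is the response. Keeping several templates per intent is one simple way to make replies feel less repetitive.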

Additionally, to understand the user intent (and even find entities), we used unsupervised approaches; that is, we did not have training examples this time. In practice, most commercial systems use a hybrid system, combining supervised and unsupervised systems.

The one thing you should take away from here is that we don't need a lot of training data to make the first usable version of a bot for a specific use case.
