
You're reading from  Conversational AI with Rasa

Product type: Book
Published in: Oct 2021
Publisher: Packt
ISBN-13: 9781801077057
Edition: 1st Edition
Authors (2):
Xiaoquan Kong

Xiaoquan is a machine learning expert specializing in NLP applications. He has extensive experience leading teams that build NLP platforms at several Fortune Global 500 companies. He is a Google Developer Expert in Machine Learning and has contributed to TensorFlow for many years. He has also contributed to the development of the Rasa framework since its early stages and became a Rasa Superhero in 2018. He manages the Rasa Chinese community and has participated in the Chinese localization of the TensorFlow documentation as a technical reviewer.

Guan Wang

Guan is currently working on AI applications and research for the insurance industry. Prior to that, he was a machine learning researcher at several industry AI labs. He was raised and educated in Mainland China and lived in Hong Kong for 10 years before relocating to Singapore in 2020. Guan holds BSc degrees in Physics and Computer Science from Peking University, and an MPhil degree in Physics from HKUST. Guan is an active tech blogger and community contributor to open source projects including Rasa, and his own projects have received more than 10,000 stars on GitHub.


Chapter 2: Natural Language Understanding in Rasa

In this chapter, we introduce how to implement Natural Language Understanding (NLU) in Rasa.

Rasa NLU is responsible for intent recognition and entity extraction. For example, if the user input is "What's the weather like tomorrow in New York?", Rasa NLU needs to recognize that the user's intent is to ask about the weather, and to extract the entities together with their types: the date is tomorrow, and the location is New York.
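To make the weather example concrete, here is a minimal Python sketch of the information an NLU step is expected to produce for that sentence. It is an illustration only, not Rasa's API: the `Entity` class, the `locate` helper, and the labels `ask_weather`, `date`, and `location` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    entity: str  # entity type, for example "date"
    value: str   # surface text of the entity
    start: int   # character offset where the value begins
    end: int     # character offset just past the value


def locate(text: str, entity_type: str, value: str) -> Entity:
    """Build an Entity by finding `value` inside `text`."""
    start = text.index(value)  # raises ValueError if the value is absent
    return Entity(entity_type, value, start, start + len(value))


text = "What's the weather like tomorrow in New York?"

# Hypothetical labels: "ask_weather", "date", and "location" are
# illustrative names, not identifiers taken from the book's project.
result = {
    "text": text,
    "intent": "ask_weather",
    "entities": [
        locate(text, "date", "tomorrow"),
        locate(text, "location", "New York"),
    ],
}

for e in result["entities"]:
    # Slicing the text with the offsets recovers the entity value.
    print(e.entity, repr(text[e.start:e.end]), e.start, e.end)
```

Storing character offsets alongside each value means the entity can always be traced back to the exact span of the user's message.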

Rasa NLU uses supervised learning algorithms to fulfill this function. Training the NLU model requires a sufficient number of examples annotated with intent and entity information. Rasa NLU has a very flexible software architecture and supports various kinds of algorithms. The implementations of those algorithms are called components. Components need to be carefully configured, and each one must satisfy the dependency relationship with its upstream and downstream components. Rasa NLU introduces...

Technical requirements

You can find all the files for this chapter in the ch02 directory of the GitHub repository at https://github.com/PacktPublishing/Conversational-AI-with-RASA.

The format of NLU training data

In the previous chapter, we created an example project by using a command-line tool of Rasa. The project layout is as follows:

.
├── actions
│   ├── actions.py
│   └── __init__.py
├── config.yml
├── credentials.yml
├── data
│   ├── nlu.yml
│   ├── rules.yml
│   └── stories.yml
├── domain.yml
├── endpoints.yml
└── tests
    └── test_stories.yml

The data/nlu.yml file in the project acts as the training data file for Rasa NLU. The training data file is written in YAML (short for YAML Ain't Markup Language) format. YAML is a general format for data storage and exchange. It...

Overview of Rasa NLU components

Rasa NLU is a pipeline-based general framework. This gives Rasa great flexibility.

A pipeline defines the order in which components process the data. Some components depend on others, and a single unmet dependency makes the whole pipeline fail. Rasa NLU checks the dependency requirements of every component. If any requirement is not met, Rasa stops the program and reports the corresponding errors and warnings.
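The idea of an ordered dependency check can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Rasa's actual validation code: the `COMPONENTS` table, the capability names `tokens` and `features`, and the `check_pipeline` function are all hypothetical. Only the component names come from Rasa.

```python
from typing import Dict, List, Set

# Hypothetical capability table: what each component needs and produces.
# The capability names ("tokens", "features") are illustrative only.
COMPONENTS: Dict[str, Dict[str, Set[str]]] = {
    "WhitespaceTokenizer": {"requires": set(), "provides": {"tokens"}},
    "RegexFeaturizer": {"requires": {"tokens"}, "provides": {"features"}},
    "CountVectorsFeaturizer": {"requires": {"tokens"}, "provides": {"features"}},
}


def check_pipeline(names: List[str]) -> List[str]:
    """Return one error message per unmet requirement, in pipeline order."""
    available: Set[str] = set()
    errors: List[str] = []
    for name in names:
        spec = COMPONENTS[name]
        # A requirement is met only if an earlier component provided it.
        for need in sorted(spec["requires"] - available):
            errors.append(f"{name} requires '{need}' from an earlier component")
        available |= spec["provides"]
    return errors


good = check_pipeline(["WhitespaceTokenizer", "RegexFeaturizer"])
bad = check_pipeline(["RegexFeaturizer", "WhitespaceTokenizer"])
print(good)
print(bad)
```

In this sketch, placing the featurizer before the tokenizer yields an error, because nothing upstream has produced tokens yet. The same component list in the opposite order passes.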

One NLU application normally includes both an intent recognition task and entity extraction task. To accomplish those tasks, here is a typical Rasa NLU pipeline:

Figure 2.3 – A typical Rasa NLU pipeline

Let's look at the components within this typical Rasa NLU pipeline:

  • Language model component: This loads the language model files that the following components rely on. For example, spaCy and MITIE language models can be loaded here.
  • Tokenizer component: This...

Configuring your Rasa NLU via a pipeline

As mentioned in the previous section, Rasa NLU is a general framework based on pipelines. This gives Rasa NLU maximum flexibility.

What is a pipeline?

A pipeline in Rasa defines the dependency relationship and data flow direction between the different components, and it allows the developer to configure each of the components. The pipeline gives the Rasa framework great flexibility and extensibility. We will discuss the extensibility advantages of pipelines in Chapter 8, Working Principles and Customization of Rasa.

In the next section, we will learn how to use the pipeline to orchestrate components.

Configuring a pipeline

The configuration format Rasa NLU uses is YAML. Here is an example of a configuration file of Rasa NLU:

language: en
pipeline:
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  ...

The output of Rasa NLU

In order to properly debug Rasa NLU, developers should understand its output format.

The output format of Rasa NLU's inference is as follows:

{
  "text": "show me chinese restaurants",
  "intent": "restaurant_search",
  "entities": [
    {
      "start": 8,
      "end": 15,
      "value": "chinese",
      "entity": "cuisine",
      "extractor": "CRFEntityExtractor",
      "confidence": 0.854,
      "processors": []
    }
  ]
}

It contains three main parts: text, intent, and entities. The text field is the raw text...
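When debugging, a quick sanity check is to confirm that each entity's `start` and `end` offsets really slice its `value` out of `text`. The following Python sketch does this for the output shown above, using only the standard `json` module. The helper name `mismatched_entities` is hypothetical.

```python
import json

# The inference output shown above, as a JSON string.
raw = """
{
  "text": "show me chinese restaurants",
  "intent": "restaurant_search",
  "entities": [
    {"start": 8, "end": 15, "value": "chinese", "entity": "cuisine",
     "extractor": "CRFEntityExtractor", "confidence": 0.854, "processors": []}
  ]
}
"""


def mismatched_entities(parsed: dict) -> list:
    """Return the entities whose offsets do not slice out their own value."""
    text = parsed["text"]
    return [
        e for e in parsed["entities"]
        if text[e["start"]:e["end"]] != e["value"]
    ]


parsed = json.loads(raw)
bad_entities = mismatched_entities(parsed)
print(parsed["intent"], len(parsed["entities"]), len(bad_entities))
```

An empty result means that every entity's offsets and value agree with the raw text.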

Training and running Rasa NLU

Rasa is a very cohesive framework. We can use the built-in command-line tools of Rasa that we already introduced in the first chapter to perform tasks such as model training and prediction.

Let's start with model training.

Training our models

We can start training models once we have configured the pipeline and prepared the training data. Rasa provides commands that help developers train a model quickly. As long as we follow the official project structure, Rasa's commands can locate the configuration and data files automatically.

The command for training a model is as follows:

rasa train nlu

This command will look for training data in the data path, use config.yml as the pipeline configuration, and save the model (a zipped file) into the models path with nlu- as the prefix of the model's name. The length of training time depends on the components used and the size of the training dataset. The log will be printed continuously...
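After several training runs, the models directory may hold more than one model file. The sketch below picks the most recent NLU model by name. It rests on two assumptions that the text above does not state: that model file names embed a timestamp that sorts chronologically, and that sorting by name is therefore enough. It matches only on the `nlu-` prefix and makes no assumption about the file extension. The function name `latest_nlu_model` is hypothetical.

```python
from pathlib import Path
from typing import Optional


def latest_nlu_model(models_dir: str = "models") -> Optional[Path]:
    """Return the NLU model file whose name sorts last, or None if none exist.

    Assumes file names after the "nlu-" prefix sort chronologically.
    """
    candidates = sorted(
        p for p in Path(models_dir).glob("nlu-*") if p.is_file()
    )
    return candidates[-1] if candidates else None


if __name__ == "__main__":
    # Prints None when the directory is missing or holds no NLU model.
    print(latest_nlu_model("models"))
```

If the naming assumption does not hold in your setup, sorting by file modification time is a reasonable alternative.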

Practice – building the NLU part of a medical bot

The best way to learn Rasa NLU is by practice. Here, we work on a project to build a simple NLU component for a medical domain chatbot. All the project files can be found under the directory named ch02 in the GitHub repository at https://github.com/PacktPublishing/Conversational-AI-with-RASA.

What are the features of our bot?

Our bot supports the following functions:

  • Recognize the intent of a medicine inquiry or a hospital and department inquiry.
  • Extract entities for diseases and symptoms.
  • Handle simple greetings.

How can we implement our bot in Rasa?

Let's follow the official Rasa project structure:

.
├── config.yml
├── credentials.yml
├── data
│   └── nlu.yml
├── domain.yml
├── endpoints.yml
└── models

In this simple NLU project...

Summary

In this chapter, we discussed the NLU part of Rasa, which is an important part of the framework. We gave a detailed explanation of the NLU training data structure, discussed the high-level architecture of pipelines and components, and stepped through an example that builds the NLU part of a medical bot. At this point, you should understand the architecture of Rasa NLU and how to configure it, and you should be able to perform model training and inference.

In the next chapter, we will introduce Rasa Core.
