Teaching a Robot to Listen

Teaching a robot to listen to spoken instructions is a discipline in itself. It is not enough for the robot to recognize individual words or a canned phrase; we want it to respond to normal spoken commands with a normal variety of phrasing. We might say, “Pick up the toys,” “Please pick up all the toys,” or “Clean this mess up,” any of which should be a valid command instructing the robot to begin searching the room for toys to pick up and put away. We will use a variety of techniques and processes in this chapter, building on an open source voice assistant called Mycroft, an AI-based speech recognition and natural language processing (NLP) engine that we can program and extend. We will add some additional capability to Mycroft: we will use a technique I call the “fill in the blank” method of command processing to extract the...
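To make the “fill in the blank” idea concrete, here is a minimal sketch of a Mycroft skill that maps several phrasings to a single intent and pulls out a slot (the thing to pick up). The skill name, the intent file, and the entity name are hypothetical placeholders for illustration, not the exact code developed later in the chapter.

# pick_up.intent (hypothetical Padatious intent file registered with this skill):
#   pick up the {object}
#   please pick up all the {object}
#   clean this mess up

from mycroft import MycroftSkill, intent_file_handler


class PickUpSkill(MycroftSkill):
    """Hypothetical skill: several phrasings all trigger one 'pick up' intent."""

    @intent_file_handler('pick_up.intent')
    def handle_pick_up(self, message):
        # The intent parser fills in the blank: {object} arrives as a slot in
        # message.data; default to "toys" if the user did not name an object.
        target = message.data.get('object', 'toys')
        self.speak('OK, I will pick up the {}'.format(target))
        # Here we would hand the command off to the robot's pick-up behavior,
        # for example by publishing a message on a ROS topic.


def create_skill():
    return PickUpSkill()

However the user phrases the request, the skill receives the same intent, and the only variable part (the object to pick up) is delivered as a named slot, which is exactly the blank we fill in.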

Technical requirements

This chapter uses the following tools:

The code used in this chapter can be found in the GitHub repository for this book at https://github.com/PacktPublishing/Artificial-Intelligence-for-Robotics-2e.

Exploring robot speech recognition with NLP

This is going to be a rather involved chapter, but all of the concepts are fairly easy to understand. We will end up with a strong framework on which to build voice recognition and commands. Not only will you get a voice-based command system for your robot, but also a full-featured digital assistant that can tell jokes. Let’s first quickly introduce NLP.

Briefly introducing the NLP concept

NLP is not just about converting sound waves to written words (speech to text, or STT), but also about understanding what those words mean. We don’t want just a set of rigid, pre-programmed spoken commands; we want the robot to have some ability to respond to natural human speech.

We will be using two different forms of STT processing:

  • Spectrum analysis: This technique is used to detect when you say the robot’s name (the wake word). It recognizes a word or phrase by sampling the frequencies and amplitudes that make up the word. This process has the advantage...
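As a rough illustration of the spectrum analysis idea (not the actual wake-word engine used by Mycroft), the following sketch compares the frequency content of an incoming audio frame against a stored spectrum of the wake word using a normalized correlation. The template spectrum and the threshold value are hypothetical and would need to be tuned for a real microphone and speaker.

import numpy as np

def spectrum(frame):
    """Return the magnitude spectrum of one audio frame (1-D float array)."""
    windowed = frame * np.hanning(len(frame))   # window to reduce spectral leakage
    return np.abs(np.fft.rfft(windowed))

def matches_wake_word(frame, template_spectrum, threshold=0.8):
    """Crude wake-word test: cosine similarity between the frame's spectrum
    and a pre-recorded template spectrum of the wake word."""
    spec = spectrum(frame)
    n = min(len(spec), len(template_spectrum))
    a, b = spec[:n], template_spectrum[:n]
    similarity = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)
    return similarity > threshold

A real wake-word detector is more robust than this (it handles timing, noise, and speaker variation), but the core idea is the same: match the spectral fingerprint of the incoming audio against the fingerprint of the expected word.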

Programming our robot

As discussed earlier in this chapter, Mycroft is a digital assistant similar to Apple’s Siri or Amazon’s Alexa in that it can listen to voice commands phrased in a mostly natural way and pass those commands on to a computer. We are using it because it has an interface that runs on the Jetson Nano. In this section, we will set up our hardware and our software (i.e., Mycroft).

Setting up the hardware

We will be installing Mycroft on the Nvidia Jetson Nano (or whichever board you’re using). One of the few things the Jetson Nano does not come with is audio capability: it has no speaker or microphone. I found that a quick and effective way to add that capability was to use an existing hardware kit that provides both a very high-quality speaker and an excellent set of stereo microphones in a robot-friendly form factor. Note that this approach works with pretty much any Linux single-board computer (SBC).

The kit is a miniature...
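Once the microphone and speaker hardware is attached, it is worth confirming that Linux can actually see it before installing Mycroft. One quick sanity check from Python, assuming the sounddevice package is installed (pip install sounddevice), is to list the audio devices and record a short test clip; the device names shown will depend on your particular kit and drivers.

import sounddevice as sd

# List every audio device ALSA exposes; the kit's microphones and speaker
# should appear here if the drivers are loaded correctly.
print(sd.query_devices())

# Record two seconds from the default input device as a sanity check.
sample_rate = 16000
clip = sd.rec(int(2 * sample_rate), samplerate=sample_rate, channels=1)
sd.wait()
print('Recorded', len(clip), 'samples; peak amplitude:', abs(clip).max())

If the device list is empty or the recorded clip is silent, fix the audio configuration first; Mycroft cannot hear or speak until the operating system can.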

Summary

This chapter introduced NLP for robotics and concentrated on developing a natural language interface for the robot that accomplished three tasks: starting the pick-up-the-toys process, telling knock-knock jokes, and listening to knock-knock jokes.

The concepts introduced included recognizing words by phonemes, turning phonemes into graphemes and graphemes into words, parsing intent from sentences, and executing computer programs through a voice interface. We introduced Mycroft, an open source AI-based voice assistant that runs on the Jetson Nano. We also wrote a joke database to entertain small children with some very simple dialog.

In the next chapter, we’ll be learning about robot navigation using landmarks, neural networks, obstacle avoidance, and machine learning.

Questions

  1. Do some internet research on why the AI engine was named Mycroft. How many different stories did you find, and which one did you like?
  2. In the discussion of intent, how would you design a neural network to predict command intent from natural language sentences?
  3. Rewrite “Receive knock-knock jokes” to remember the jokes told to the robot by adding them to the joke database used by the “tell knock-knock jokes” program. Is this machine learning?
  4. Modify the “tell jokes” program to play sounds from a wave file, such as a music clip, as well as doing TTS.
  5. The sentence structures used in this chapter are all based on English grammar. Other languages, such as French and Japanese, have different structures. How does that change the parsing of sentences? Would the program we wrote be able to understand Yoda?
  6. Do you think that Mycroft’s Intent Engine is actually understanding intent, or just pulling out keywords...
