Chapter 15: Voice Communication with a Robot Using Mycroft

Using our voice to ask a robot to do something, and receiving a voice response, has long been seen as a sign of intelligence. Devices around us, such as those running Alexa and Google Assistant, already offer this, and being able to integrate our system with such tools gives us access to a powerful voice assistant system. Mycroft is a Python-based open source voice assistant. We will get it running on a Raspberry Pi by connecting a speaker and a microphone, and then we will run instructions on our robot based on the words we speak.

In this chapter, we will get an overview of Mycroft and then learn how to add a speaker/microphone board to a Raspberry Pi. We will then install and configure Mycroft on that Raspberry Pi.

We'll also extend our use of Flask programming, building a Flask API with more control points.

Toward the end of the chapter, we will create our own skills code to connect a voice...

Technical requirements

You will require the following hardware for this chapter:

  • An additional Raspberry Pi 4 (model B).
  • An SD card (at least 8 GB).
  • A PC that can write the card (with the balenaEtcher software).
  • The ReSpeaker 2-Mics Pi HAT.
  • Mini Audio Magnet Raspberry Pi Speaker – a tiny speaker with a JST connector – or any speaker with a 3.5 mm jack.
  • It may be helpful to have a Micro-HDMI to HDMI cable for troubleshooting.
  • A USB-C power supply (the Raspberry Pi 4 takes its power over USB-C).
  • The robot from the previous chapters (after all, we intend to get this moving).

The code for this chapter is available on GitHub at https://github.com/PacktPublishing/Learn-Robotics-Programming-Second-Edition/tree/master/chapter15.

Check out the following video to see the Code in Action: https://bit.ly/2N5bXqr

Introducing Mycroft – understanding voice agent terminology

Mycroft is a software suite known as a voice assistant. Mycroft listens for voice commands and takes actions based on those commands. Mycroft is written in Python and is free and open source. It performs most of its voice processing in the cloud. After a command has been processed, Mycroft responds to the human with a synthesized voice.

Mycroft is documented online and has a community of users. There are alternatives that you could consider after you've experimented with Mycroft – for example, Jasper, Melissa-AI, and Google Assistant.

So, what are the concepts of a voice assistant? Let's look at them in the following subsections.

Speech to text

Speech to text (STT) describes systems that take audio containing human speech and turn it into a series of words that a computer can then process.

STT systems can run locally, or they can run in the cloud on far more powerful machines.

Wake words

...

Limitations of listening for speech on a robot

Before we start to build this, we should consider what we are going to make. Should the speaker and microphone be on the robot or somewhere else? Will the processing be local or in the cloud?

Here are some considerations to keep in mind:

  • Noise: A robot with motors is a noisy environment. Having a microphone anywhere near the motors will make it close to useless.
  • Power: A voice assistant listens continuously. The robot already places many demands on power with the other sensors running on it, and this demand applies both to battery power and to CPU capacity.
  • Size and physical location: The speaker and voice HAT would add height and wiring complications to an already busy robot.

On a large robot, a microphone and speaker combination could sit on a stalk – a tall standoff carrying a second Raspberry Pi. That is unsuitable for this small and simple robot, though. We will create a...

Adding sound input and output to the Raspberry Pi

Before we can use voice processing or a voice assistant, we need to give the Raspberry Pi a speaker and a microphone. A few Raspberry Pi add-ons provide this. My recommendation is the ReSpeaker 2-Mics Pi HAT, which is widely available and has a microphone array (for better recognition) as well as connections for speakers.

The next photograph shows the ReSpeaker 2-Mics Pi HAT:

Figure 15.1 – The ReSpeaker 2-Mics Pi HAT

Figure 15.1 shows a photo of the ReSpeaker 2-Mics Pi HAT mounted on a Raspberry Pi. On the left, I've labeled the left microphone. The HAT has two microphones – tiny rectangular metal parts, one on each side. The next label points out three RGB LEDs and a button connected to a GPIO pin. After this come the two ways of connecting speakers – a 3.5 mm jack or a JST connector. I recommend connecting a speaker so you can hear output from this HAT. The last label highlights the right microphone...
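
Once the HAT is fitted and its driver installed, it is worth a quick check that the microphones and speaker both work. The following is a minimal loopback sketch; it uses the third-party sounddevice library, which is my assumption here rather than something this chapter depends on. It records a few seconds from the HAT's microphones and plays the result back through the connected speaker:

    # mic_speaker_test.py – record from the HAT's microphones, then play back.
    # Assumes the seeed-voicecard driver is set up as the default audio device
    # and that sounddevice is installed (pip install sounddevice).
    import sounddevice as sd

    SAMPLE_RATE = 16000  # 16 kHz is plenty for speech
    SECONDS = 3          # length of the test recording
    CHANNELS = 2         # the HAT has two microphones

    print("Recording - say something...")
    recording = sd.rec(int(SECONDS * SAMPLE_RATE),
                       samplerate=SAMPLE_RATE, channels=CHANNELS)
    sd.wait()  # block until the recording finishes

    print("Playing back...")
    sd.play(recording, samplerate=SAMPLE_RATE)
    sd.wait()

If you hear your voice played back, the microphones and speaker are wired correctly; if not, revisit the driver installation and the speaker connection before going further.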

Programming a Flask API

This chapter aims to control our robot with Mycroft. To do so, we need to give our robot a way to receive commands from other systems. An Application Programming Interface (API) on a server lets us decouple systems like this: one system sends commands across the network to another and receives a response. Flask is ideally suited to building this.

Web-based APIs have endpoints that other systems make their requests to, and these endpoints map roughly onto functions or methods in a Python module. As you'll see, we map our API endpoints directly to functions in the Python robot_modes module, as sketched below.
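
To give a taste of the shape of this, here is a minimal sketch of such an API. The /run/<mode_name> and /stop endpoints, and the robot_modes functions they call, are placeholders for illustration; the chapter builds up the real module step by step:

    # control_server.py – a minimal sketch of a Flask API for robot control.
    # robot_modes and its run/stop functions are hypothetical placeholders
    # standing in for the module developed in this chapter.
    from flask import Flask

    import robot_modes

    app = Flask(__name__)

    @app.route("/run/<mode_name>", methods=["POST"])
    def run(mode_name):
        # Each endpoint maps directly onto a function in robot_modes.
        robot_modes.run(mode_name)
        return {"response": "Starting " + mode_name}

    @app.route("/stop", methods=["POST"])
    def stop():
        robot_modes.stop()
        return {"response": "Stopping"}

    if __name__ == "__main__":
        # Listen on all interfaces so other machines can reach the robot.
        app.run(host="0.0.0.0", port=5000)

Another system can then start a behavior with a single HTTP POST to /run/<mode_name>, without knowing anything about how the robot implements it.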

Before we get too far into building, let's look at the design of this system – it will also reveal how Mycroft works.

Overview of Mycroft controlling the robot

The following diagram shows how a user controls a robot via Mycroft:

Figure 15.4 – Overview of the robot skill

The diagram in Figure 15.4 shows how data flows in this system...

Programming a voice agent with Mycroft on the Raspberry Pi

The robot backend provided by the Flask control server is good enough to build our Mycroft skill against.

In Figure 15.4, you saw that when Mycroft hears the wake word, it wakes and transmits the sound you make to the Google STT system, which then returns the text.

Mycroft then matches this text against vocabulary files for your region and against the intents set up in skills. Once matched, Mycroft invokes an intent in a skill. Our robot skill has intents that make network (HTTP) requests to the Flask control server we created for our robot. When the Flask server responds to say that it has processed the request (perhaps the behavior has started), the robot skill chooses a dialog to speak back to the user, confirming that it has carried out the request or reporting a problem.
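
To make this flow concrete, here is a minimal sketch of what such a skill can look like. The intent file, dialog names, and robot address are placeholders for illustration; the real skill is developed later in this chapter:

    # __init__.py – a minimal sketch of a Mycroft skill driving the robot.
    # The intent file (test.intent), the dialog names, and the robot's
    # address are hypothetical placeholders.
    import requests
    from mycroft import MycroftSkill, intent_file_handler

    ROBOT_ADDRESS = "http://myrobot.local:5000"  # assumed Flask server address

    class MyRobotSkill(MycroftSkill):
        @intent_file_handler("test.intent")
        def handle_test(self, message):
            # Ask the Flask control server on the robot to start a behavior.
            response = requests.post(ROBOT_ADDRESS + "/run/test")
            if response.ok:
                self.speak_dialog("confirm")  # speak a success dialog
            else:
                self.speak_dialog("unable")   # report a problem

    def create_skill():
        # Mycroft calls this factory function when it loads the skill.
        return MyRobotSkill()

When an utterance matches a phrase in test.intent, Mycroft calls handle_test, which makes the HTTP request and then speaks the chosen dialog back to the user.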

We'll start with a simple skill, with a basic intent, and then you can...

Summary

In this chapter, you learned voice assistant terminology – speech to text, wake words, intents, skills, utterances, vocabulary, and dialog. You considered where you would install microphones and speakers and whether they should be on board a robot.

You then saw how to physically install a speaker/microphone combination onto a Raspberry Pi and prepare the software for the Pi to use it. You installed Picroft – a Mycroft environment for Raspbian – to get the voice agent software.

You were then able to play with Mycroft, registering it with its home base and getting it to respond to different voice commands.

You then saw how to make a robot ready for an external agent, such as a voice agent, to control it with a Flask API. You created multiple skills that communicate with the robot – a good starting point for creating more.

In the next chapter, we will bring back the IMU we introduced in Chapter 12, IMU Programming with Python, and get it to do more...

Exercises

Try these exercises to get more out of this chapter and expand your experience:

  • Try installing some other Mycroft skills from the Mycroft site and playing with them. Hint: say Hey Mycroft, install pokemon.
  • The robot mode system has a flaw; it assumes that a process you've asked to stop does stop. Should it wait and check the return code to see if it has stopped?
  • An alternative way to implement the robot modes might be to update all the behaviors to exit cleanly so you could import them instead of running in subprocesses. How tricky would this be?
  • While testing the interactions, did you find the vocabulary wanting? Perhaps extend it with phrases you might find more natural to start the different behaviors. Similarly, you could make dialogs more interesting too.
  • Add more intents to the skill, for example, wall avoiding. You could add a stop intent, although the response time may make this less than ideal.
  • Could the RGB LEDs on the ReSpeaker...

Further reading

Please refer to the following for more information:

  • Raspberry Pi Robotic Projects, Dr. Richard Grimmett, Packt Publishing, has a chapter on providing speech input and output.
  • Voice User Interface Projects, Henry Lee, Packt Publishing, focuses entirely on voice interfaces to systems. It shows you how to build chatbots and applications with the Alexa and Google Home voice agents.
  • Mycroft AI – Introduction Voice Stack, a whitepaper from Mycroft AI, gives more detail on how the Mycroft stack works and its components.
  • Mycroft has a large community that supports and discusses the technology at https://community.mycroft.ai/. I recommend consulting the community's troubleshooting information; Mycroft is under active development and has many quirks as well as many new features. It's also an excellent place to share skills you build for it.
  • Seeed Studio, the creators of the ReSpeaker 2-Mics Pi HAT, host documentation and code for this device...