Artificial Intelligence for Robotics - Second Edition


Product type Book
Published in Mar 2024
Publisher Packt
ISBN-13 9781805129592
Pages 344 pages
Edition 2nd Edition
Author (1):
Francis X. Govers III

Table of Contents (18 chapters)

Preface
Part 1: Building Blocks for Robotics and Artificial Intelligence
Chapter 1: The Foundation of Robotics and Artificial Intelligence
Chapter 2: Setting Up Your Robot
Chapter 3: Conceptualizing the Practical Robot Design Process
Part 2: Adding Perception, Learning, and Interaction to Robotics
Chapter 4: Recognizing Objects Using Neural Networks and Supervised Learning
Chapter 5: Picking Up and Putting Away Toys using Reinforcement Learning and Genetic Algorithms
Chapter 6: Teaching a Robot to Listen
Part 3: Advanced Concepts – Navigation, Manipulation, Emotions, and More
Chapter 7: Teaching the Robot to Navigate and Avoid Stairs
Chapter 8: Putting Things Away
Chapter 9: Giving the Robot an Artificial Personality
Chapter 10: Conclusions and Reflections
Answers
Index
Other Books You May Enjoy
Appendix

Recognizing Objects Using Neural Networks and Supervised Learning

This is the chapter where we’ll start to combine robotics and artificial intelligence (AI) to accomplish some of the tasks we laid out so carefully in previous chapters. The subject of this chapter is object recognition – we will be teaching the robot to recognize what a toy is so that it can decide what to pick up and what to leave alone. We will be using convolutional neural networks (CNNs) as machine learning tools for separating objects in images, recognizing them, and locating them in the camera frame so that the robot can then find them. More specifically, we’ll be using images to recognize objects: we’ll take a picture and then see whether the computer recognizes specific types of objects in it. We won’t be recognizing objects themselves, but rather images or pictures of objects. We’ll also be putting bounding boxes around objects, separating...

Technical requirements

You will be able to accomplish all of this chapter’s tasks even if your robot cannot move yet. You will, however, get better results if the camera is mounted in its proper position on the robot. If you don’t have a robot at all, you can still complete all of these tasks with a laptop and a USB camera.

Overall, here’s the hardware and software that you will need to complete the tasks in this chapter:

The source code for this chapter can be found at https://github.com/PacktPublishing/Artificial-Intelligence-for-Robotics-2e.

In the next section, we will discuss what image processing is.

A brief overview of image processing

Most of you will be very familiar with computer images, formats, pixel depths, and maybe even convolutions. We will be discussing these concepts in the following sections; if you already know this material, skip ahead. If this is new territory, read carefully, because everything we do afterward builds on this information.

Images are stored in a computer as a two-dimensional array of pixels or picture elements. Each pixel is a tiny dot. Thousands or millions of tiny dots make up each image. Each pixel is a number or series of numbers that describe its color. If the image is only a grayscale or black-and-white image, then each pixel is represented by a single number that corresponds to how dark or light the tiny dot is. This is straightforward so far.

If the image is a color picture, then each dot has three numbers that are combined to make its color. Usually, these numbers are the intensity of Red, Green, and Blue (RGB) colors. The combination...
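The grayscale and RGB representations described above can be sketched with small arrays. This is a minimal illustration (the 2×2 image and the luminance weights are our own example values, not code from the book):

```python
# A tiny 2x2 grayscale image: one brightness value (0-255) per pixel.
gray = [
    [0, 128],
    [64, 255],
]

# The same-size image in color: each pixel is an (R, G, B) triple.
rgb = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]

def to_grayscale(rgb_image):
    """Collapse each (R, G, B) triple to a single brightness value
    using the common ITU-R BT.601 luminance weights."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_image
    ]

print(to_grayscale(rgb))  # [[76, 150], [29, 255]]
```

Note that pure red, green, and blue of equal intensity map to different gray values, because the human eye (and hence the standard weights) is most sensitive to green.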

Understanding our object recognition task

Having a computer or robot recognize an image of a toy is not as simple as taking two pictures and then saying if picture A = picture B, then toy. We are going to have to do quite a bit of work to be able to recognize a variety of objects that are randomly rotated, strewn about, and at various distances. We could write a program to recognize simple shapes – hexagons, for instance, or simple color blobs – but nothing as complex as a toy stuffed dog. Writing a program that did some sort of analysis of an image and computed the pixels, colors, distributions, and ranges of every possible permutation would be extremely difficult, and the result would be very fragile – it would fail at the slightest change in lighting or color.
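The fragility of that naive picture A = picture B test is easy to demonstrate: shift every pixel by a single brightness unit, as the slightest lighting change would, and an exact-match comparison fails. The tiny grayscale array below is made up purely for illustration:

```python
# A "reference" image of a toy, as a tiny grayscale array (values 0-255).
reference = [
    [10, 200, 10],
    [200, 200, 200],
    [10, 200, 10],
]

# The same scene a moment later under very slightly brighter lighting:
# every pixel is brighter by just 1 unit.
slightly_brighter = [[min(p + 1, 255) for p in row] for row in reference]

# The naive "if picture A == picture B then toy" comparison:
print(reference == slightly_brighter)  # prints False
```

A one-unit lighting change is invisible to a person but defeats the exact comparison completely, which is why we need methods that learn robust features instead.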

Speaking from experience, I had a recent misadventure with a large robot that used a traditional computer vision system to find its battery charger station. That robot mistook an old, faded soft drink...

Image manipulation

So, now that we have an image, what can we do with it? You have probably played with Adobe Photoshop or some other image manipulation program such as GIMP, so you know that there are hundreds of operations, filters, changes, and tricks you can perform on images. For instance, we can make an image brighter or darker by adjusting the brightness. We can increase the contrast between the light parts of the image and the dark parts. We can make an image blurry, usually by applying a Gaussian blur filter. We can also make an image sharper (somewhat) by using a filter such as an unsharp mask. We can use an edge detector filter, such as the Canny filter, to isolate the edges in an image, where color or value changes sharply. We will be using all of these techniques to help the computer identify images:

Figure 4.2 – Various convolutions applied to an image

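Blurring, sharpening, and edge detection are all convolutions: a small kernel of weights slides over the image, and each pixel is replaced by the weighted sum of its neighborhood. A minimal sketch with a 3×3 box-blur kernel on a made-up grayscale array (illustrative only, not the book's code):

```python
def convolve(image, kernel):
    """Apply a 3x3 kernel to a grayscale image (list of lists),
    skipping the one-pixel border for simplicity."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(1, h - 1):
        row = []
        for x in range(1, w - 1):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    acc += kernel[ky][kx] * image[y + ky - 1][x + kx - 1]
            row.append(round(acc))
        out.append(row)
    return out

# Box blur: every output pixel is the average of its 3x3 neighborhood.
box_blur = [[1 / 9] * 3 for _ in range(3)]

image = [
    [0, 0, 0, 0],
    [0, 90, 90, 0],
    [0, 90, 90, 0],
    [0, 0, 0, 0],
]
print(convolve(image, box_blur))  # [[40, 40], [40, 40]]
```

Swapping in a different kernel (an edge kernel, a sharpening kernel) changes the effect without changing the sliding-window machinery – which is exactly the operation a CNN learns kernels for.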

By performing these manipulations, we want the computer to not have the computer software...

Using YOLOv8 – an object recognition model

Before we dive into the details of the YOLOv8 model, let’s talk about why I selected it. First of all, the learning process is pretty much the same for any CNN we might use. YOLO is a strong open source object detection model with a lot of development behind it. It’s considered state of the art, and it already does what we need – it detects objects and shows us where they are in images by drawing bounding boxes around them. So, it tells us both what the objects are and where they are located. As you will see, it is very easy to use and can be extended to detect classes of objects other than those it was originally trained on. There are a lot of YOLO users out there who can provide a lot of support and a good basis for learning about AI object recognition for robotics.
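YOLOv8 reports each detection as a class label plus a bounding box. A standard way to judge how well a predicted box matches a hand-labeled ground-truth box is intersection over union (IoU); the corner-coordinate box format and function below are a generic sketch, not the book's code:

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2),
    where (x1, y1) is the top-left corner and (x2, y2) the bottom-right."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Overlap area is zero if the boxes do not intersect.
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# A predicted toy box compared against its ground-truth label:
print(iou((10, 10, 50, 50), (20, 20, 60, 60)))  # about 0.391
```

An IoU of 1.0 means a perfect match, 0.0 means no overlap; detection benchmarks commonly count a prediction as correct when IoU exceeds a threshold such as 0.5.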

As I mentioned at the beginning of this chapter, we have two tasks we need to accomplish to reach our goal of picking up toys with a robot. First...

Summary

In this chapter, we dove head-first into the world of ANNs. An ANN can be thought of as a stepwise non-linear approximation function that slowly adjusts itself to fit a curve that matches the desired input to the desired output. The learning process consists of several steps, including preparing data, labeling data, creating the network, initializing the weights, creating the forward pass that provides the output, and calculating the loss (also called the error). We created a special type of ANN, a CNN, to examine images. The network was trained using images with toys, to which we added bounding boxes to tell the network what part of the image was a toy. We trained the network to get an accuracy better than 87% in classifying images with toys in them. Finally, we tested the network to verify its output and tuned our results using the Adam adaptive descent algorithm.
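The forward-pass-and-loss steps recapped above can be sketched for a single artificial neuron; the weights, inputs, and target below are illustrative values, not from the book:

```python
import math

def forward(inputs, weights, bias):
    """Forward pass: weighted sum of inputs plus bias,
    squashed by a sigmoid activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 / (1 + math.exp(-z))

def mse_loss(predicted, target):
    """Squared error between one prediction and its target -
    the quantity training tries to drive toward zero."""
    return (predicted - target) ** 2

inputs, weights, bias, target = [1.0, 0.5], [0.4, -0.2], 0.1, 1.0
out = forward(inputs, weights, bias)   # forward pass produces the output
loss = mse_loss(out, target)           # loss measures how far off we are
print(round(out, 3), round(loss, 3))   # 0.599 0.161
```

Training then nudges the weights and bias in the direction that reduces this loss; optimizers such as Adam do the nudging with per-weight adaptive step sizes.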

In the next chapter, we will look at machine learning for the robot arm in terms of reinforcement learning...

Questions

  1. We went through a lot in this chapter. You can use the framework provided to investigate the properties of neural networks. Try several activation functions, or different settings for convolutions, to see what changes in the training process.
  2. Draw a diagram of an artificial neuron and label the parts. Look up a natural, human biological neuron and compare them.
  3. Which features of a real neuron and an artificial neuron are the same? Which ones are different?
  4. What effect does the learning rate have on gradient descent? What if the learning rate is too large? Too small?
  5. What relationship does the first layer of a neural network have with the input?
  6. What relationship does the last layer of a neural network have with the output?
  7. Look up three kinds of loss functions and describe how they work. Include mean square loss and the two kinds of cross-entropy loss.
  8. What would you change if your network was trained and reached 40% accuracy of the classification...

Further reading

For more information on the topics that were covered in this chapter, please refer to the following resources:

  • Python Deep Learning Cookbook, by Indra den Bakker, Packt Publishing, 2017
  • Artificial Intelligence with Python, by Prateek Joshi, Packt Publishing, 2017
  • Python Deep Learning, by Valentino Zocca, Gianmario Spacagna, Daniel Slater, and Peter Roelants, Packt Publishing, 2017
  • PyImageSearch Blog, by Adrian Rosebrock, available at pyimagesearch.com, 2018