Artificial Intelligence for Robotics - Second Edition


Product type Book
Published in Mar 2024
Publisher Packt
ISBN-13 9781805129592
Pages 344 pages
Edition 2nd Edition
Author (1):
Francis X. Govers III

Table of Contents (18 chapters)

Preface
Part 1: Building Blocks for Robotics and Artificial Intelligence
Chapter 1: The Foundation of Robotics and Artificial Intelligence
Chapter 2: Setting Up Your Robot
Chapter 3: Conceptualizing the Practical Robot Design Process
Part 2: Adding Perception, Learning, and Interaction to Robotics
Chapter 4: Recognizing Objects Using Neural Networks and Supervised Learning
Chapter 5: Picking Up and Putting Away Toys using Reinforcement Learning and Genetic Algorithms
Chapter 6: Teaching a Robot to Listen
Part 3: Advanced Concepts – Navigation, Manipulation, Emotions, and More
Chapter 7: Teaching the Robot to Navigate and Avoid Stairs
Chapter 8: Putting Things Away
Chapter 9: Giving the Robot an Artificial Personality
Chapter 10: Conclusions and Reflections
Answers
Index
Other Books You May Enjoy
Appendix

Recognizing Objects Using Neural Networks and Supervised Learning

This is the chapter where we’ll start to combine robotics and artificial intelligence (AI) to accomplish some of the tasks we laid out so carefully in previous chapters. The subject of this chapter is object recognition – we will be teaching the robot to recognize what a toy is so that it can decide what to pick up and what to leave alone. We will be using convolutional neural networks (CNNs) as machine learning tools for separating objects in images, recognizing them, and locating them in the camera frame so that the robot can then find them. More specifically, we’ll be using images to recognize objects: we’ll take a picture and then see whether the computer recognizes specific types of objects in it. We won’t be recognizing objects themselves, but rather images or pictures of objects. We’ll also be putting bounding boxes around objects, separating...

Technical requirements

You will be able to accomplish all of this chapter’s tasks even if your robot cannot move yet. You will, however, get better results if the camera is mounted in its proper position on the robot. If you don’t have a robot at all, you can still complete all of these tasks with a laptop and a USB camera.

Overall, here’s the hardware and software that you will need to complete the tasks in this chapter:

The source code for this chapter can be found at https://github.com/PacktPublishing/Artificial-Intelligence-for-Robotics-2e.

In the next section, we will discuss what image processing is.

A brief overview of image processing

Most of you will be very familiar with computer images, formats, pixel depths, and maybe even convolutions. We will be discussing these concepts in the following sections; if you already know this material, skip ahead. If this is new territory, read carefully, because everything we do afterward builds on this information.

Images are stored in a computer as a two-dimensional array of pixels or picture elements. Each pixel is a tiny dot. Thousands or millions of tiny dots make up each image. Each pixel is a number or series of numbers that describe its color. If the image is only a grayscale or black-and-white image, then each pixel is represented by a single number that corresponds to how dark or light the tiny dot is. This is straightforward so far.

If the image is a color picture, then each dot has three numbers that are combined to make its color. Usually, these numbers are the intensity of Red, Green, and Blue (RGB) colors. The combination...
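The grayscale and RGB representations described above can be sketched with small arrays. This is a minimal illustration (the 2×2 image and the luminance weights are our own example values, not code from the book):

```python
# A tiny 2x2 grayscale image: one brightness value (0-255) per pixel.
gray = [
    [0, 128],
    [64, 255],
]

# The same-size image in color: each pixel is an (R, G, B) triple.
rgb = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]

def to_grayscale(rgb_image):
    """Collapse each (R, G, B) triple to a single brightness value
    using the common ITU-R BT.601 luminance weights."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_image
    ]

print(to_grayscale(rgb))  # [[76, 150], [29, 255]]
```

Note that pure red, green, and blue of equal intensity map to different gray values, because the human eye (and hence the standard weights) is most sensitive to green.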

Understanding our object recognition task

Having a computer or robot recognize an image of a toy is not as simple as taking two pictures and then saying if picture A = picture B, then toy. We are going to have to do quite a bit of work to be able to recognize a variety of objects that are randomly rotated, strewn about, and at various distances. We could write a program to recognize simple shapes – hexagons, for instance, or simple color blobs – but nothing as complex as a toy stuffed dog. Writing a program that did some sort of analysis of an image and computed the pixels, colors, distributions, and ranges of every possible permutation would be extremely difficult, and the result would be very fragile – it would fail at the slightest change in lighting or color.
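The fragility of that naive picture A = picture B test is easy to demonstrate: shift every pixel by a single brightness unit, as the slightest lighting change would, and an exact-match comparison fails. The tiny grayscale array below is made up purely for illustration:

```python
# A "reference" image of a toy, as a tiny grayscale array (values 0-255).
reference = [
    [10, 200, 10],
    [200, 200, 200],
    [10, 200, 10],
]

# The same scene a moment later under very slightly brighter lighting:
# every pixel is brighter by just 1 unit.
slightly_brighter = [[min(p + 1, 255) for p in row] for row in reference]

# The naive "if picture A == picture B then toy" comparison:
print(reference == slightly_brighter)  # prints False
```

A one-unit lighting change is invisible to a person but defeats the exact comparison completely, which is why we need methods that learn robust features instead.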

Speaking from experience, I had a recent misadventure with a large robot that used a traditional computer vision system to find its battery charger station. That robot mistook an old, faded soft drink...

Image manipulation

So, now that we have an image, what can we do with it? You have probably played with Adobe Photoshop or some other image manipulation program such as GIMP, so you know that there are hundreds of operations, filters, changes, and tricks you can perform on images. For instance, we can make an image brighter or darker by adjusting the brightness. We can increase the contrast between the light parts of the image and the dark parts. We can make an image blurry, usually by applying a Gaussian blur filter. We can also make an image sharper (somewhat) by using a filter such as an unsharp mask. We can use an edge detector filter, such as the Canny filter, to isolate the edges in an image, where color or value changes sharply. We will be using all of these techniques to help the computer identify images:

Figure 4.2 – Various convolutions applied to an image

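Blurring, sharpening, and edge detection are all convolutions: a small kernel of weights slides over the image, and each pixel is replaced by the weighted sum of its neighborhood. A minimal sketch with a 3×3 box-blur kernel on a made-up grayscale array (illustrative only, not the book's code):

```python
def convolve(image, kernel):
    """Apply a 3x3 kernel to a grayscale image (list of lists),
    skipping the one-pixel border for simplicity."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(1, h - 1):
        row = []
        for x in range(1, w - 1):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    acc += kernel[ky][kx] * image[y + ky - 1][x + kx - 1]
            row.append(round(acc))
        out.append(row)
    return out

# Box blur: every output pixel is the average of its 3x3 neighborhood.
box_blur = [[1 / 9] * 3 for _ in range(3)]

image = [
    [0, 0, 0, 0],
    [0, 90, 90, 0],
    [0, 90, 90, 0],
    [0, 0, 0, 0],
]
print(convolve(image, box_blur))  # [[40, 40], [40, 40]]
```

Swapping in a different kernel (an edge kernel, a sharpening kernel) changes the effect without changing the sliding-window machinery – which is exactly the operation a CNN learns kernels for.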

By performing these manipulations, we want the computer to not have the computer software...

Using YOLOv8 – an object recognition model

Before we dive into the details of the YOLOv8 model, let’s talk about why I selected it. First of all, the learning process is pretty much the same for any CNN we might use. YOLO is a strong open source object detection model with a lot of development behind it. It’s considered state of the art, and it already does what we need – it detects objects and shows us where they are in images by drawing bounding boxes around them. So, it tells us both what the objects are and where they are located. As you will see, it is very easy to use and can be extended to detect classes of objects other than those it was originally trained on. There are a lot of YOLO users out there who can provide a lot of support and a good basis for learning about AI object recognition for robotics.
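YOLOv8 reports each detection as a class label plus a bounding box. A standard way to judge how well a predicted box matches a hand-labeled ground-truth box is intersection over union (IoU); the corner-coordinate box format and function below are a generic sketch, not the book's code:

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2),
    where (x1, y1) is the top-left corner and (x2, y2) the bottom-right."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Overlap area is zero if the boxes do not intersect.
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# A predicted toy box compared against its ground-truth label:
print(iou((10, 10, 50, 50), (20, 20, 60, 60)))  # about 0.391
```

An IoU of 1.0 means a perfect match, 0.0 means no overlap; detection benchmarks commonly count a prediction as correct when IoU exceeds a threshold such as 0.5.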

As I mentioned at the beginning of this chapter, we have two tasks we need to accomplish to reach our goal of picking up toys with a robot. First...

Summary

In this chapter, we dove head-first into the world of ANNs. An ANN can be thought of as a stepwise non-linear approximation function that slowly adjusts itself to fit a curve that matches the desired input to the desired output. The learning process consists of several steps, including preparing data, labeling data, creating the network, initializing the weights, creating the forward pass that provides the output, and calculating the loss (also called the error). We created a special type of ANN, a CNN, to examine images. The network was trained using images with toys, to which we added bounding boxes to tell the network what part of the image was a toy. We trained the network to get an accuracy better than 87% in classifying images with toys in them. Finally, we tested the network to verify its output and tuned our results using the Adam adaptive descent algorithm.
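The forward-pass-and-loss steps recapped above can be sketched for a single artificial neuron; the weights, inputs, and target below are illustrative values, not from the book:

```python
import math

def forward(inputs, weights, bias):
    """Forward pass: weighted sum of inputs plus bias,
    squashed by a sigmoid activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 / (1 + math.exp(-z))

def mse_loss(predicted, target):
    """Squared error between one prediction and its target -
    the quantity training tries to drive toward zero."""
    return (predicted - target) ** 2

inputs, weights, bias, target = [1.0, 0.5], [0.4, -0.2], 0.1, 1.0
out = forward(inputs, weights, bias)   # forward pass produces the output
loss = mse_loss(out, target)           # loss measures how far off we are
print(round(out, 3), round(loss, 3))   # 0.599 0.161
```

Training then nudges the weights and bias in the direction that reduces this loss; optimizers such as Adam do the nudging with per-weight adaptive step sizes.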

In the next chapter, we will look at machine learning for the robot arm in terms of reinforcement learning...

Questions

  1. We went through a lot in this chapter. You can use the framework provided to investigate the properties of neural networks. Try several activation functions, or different settings for convolutions, to see what changes in the training process.
  2. Draw a diagram of an artificial neuron and label the parts. Look up a natural, human biological neuron and compare them.
  3. Which features of a real neuron and an artificial neuron are the same? Which ones are different?
  4. What effect does the learning rate have on gradient descent? What if the learning rate is too large? Too small?
  5. What relationship does the first layer of a neural network have with the input?
  6. What relationship does the last layer of a neural network have with the output?
  7. Look up three kinds of loss functions and describe how they work. Include mean square loss and the two kinds of cross-entropy loss.
  8. What would you change if your network was trained and reached 40% accuracy of the classification...

Further reading

For more information on the topics that were covered in this chapter, please refer to the following resources:

  • Python Deep Learning Cookbook, by Indra den Bakker, Packt Publishing, 2017
  • Artificial Intelligence with Python, by Prateek Joshi, Packt Publishing, 2017
  • Python Deep Learning, by Valentino Zocca, Gianmario Spacagna, Daniel Slater, and Peter Roelants, Packt Publishing, 2017
  • PyImageSearch Blog, by Adrian Rosebrock, available at pyimagesearch.com, 2018