You're reading from Learning OpenCV 4 Computer Vision with Python 3 - Third Edition

Product type: Book

Published in: Feb 2020

Reading level: Intermediate

Publisher: Packt

ISBN-13: 9781789531619

Edition: 3rd Edition

Languages: Python

Tools: OpenCV

Concepts: Computer Vision

Authors (2): Joseph Howse, Joe Minichino

Introduction to Neural Networks with OpenCV

This chapter introduces a family of machine learning models called artificial neural networks (ANNs), or sometimes just neural networks. A key characteristic of these models is that they attempt to learn relationships among variables in a multi-layered fashion; they learn multiple functions to predict intermediate results before combining these into a single function to predict something meaningful (such as the class of an object). Recent versions of OpenCV contain an increasing amount of functionality related to ANNs – and, in particular, ANNs with many layers, called deep neural networks (DNNs). We will experiment with both shallower ANNs and DNNs in this chapter.

We have already gained some exposure to machine learning in other chapters – especially in Chapter 7, Building Custom Object Detectors, where we developed a...

Technical requirements

This chapter uses Python, OpenCV, and NumPy. Please refer to Chapter 1, Setting Up OpenCV, for installation instructions.

The completed code and sample videos for this chapter can be found in this book's GitHub repository, https://github.com/PacktPublishing/Learning-OpenCV-4-Computer-Vision-with-Python-Third-Edition, in the chapter10 folder.

Understanding ANNs

Let's define ANNs in terms of their basic role and components. Although much of the literature on ANNs emphasizes the idea that they are biologically inspired by the way neurons connect in a brain, we don't need to be biologists or neuroscientists to understand the fundamental concepts of an ANN.

First of all, an ANN is a statistical model. A statistical model is a pair of elements: the space, S (a set of observations), and the probability, P, where P is a distribution that approximates S (in other words, a function that would generate a set of observations very similar to S).

Here are two different ways to think of P:

P is a simplification of a complex scenario.
P is the function that generated S in the first place, or at the very least a set of observations very similar to S.

Thus, ANNs are models that...
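To make the definition of a statistical model more concrete, here is a minimal NumPy sketch. It assumes, purely for illustration, that S is a set of one-dimensional observations and that a Gaussian is an adequate choice for P; the numbers and variable names are invented for this example and do not come from the chapter.

```python
import numpy as np

rng = np.random.default_rng(0)

# S: a set of observations. They are simulated here; in practice they
# would be measured. (The values 5.0 and 2.0 are arbitrary.)
S = rng.normal(loc=5.0, scale=2.0, size=1000)

# P: a simple distribution fitted to S. A Gaussian is fully described by
# two numbers, so it is a drastic simplification of whatever produced S.
mu, sigma = S.mean(), S.std()

# Sampling from P generates a new set of observations that resembles S.
S_like = rng.normal(loc=mu, scale=sigma, size=1000)

print(f'S:         mean={S.mean():.2f}, std={S.std():.2f}')
print(f'Draw of P: mean={S_like.mean():.2f}, std={S_like.std():.2f}')
```

An ANN plays the same role as P here, only with far more parameters than a mean and a standard deviation.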

Training a basic ANN in OpenCV

OpenCV provides a class, cv2.ml_ANN_MLP, that implements an ANN as a multi-layer perceptron (MLP). This is exactly the kind of model we described earlier, in the Understanding neurons and perceptrons section.

To create an instance of cv2.ml_ANN_MLP, and to format data for this ANN's training and use, we rely on functionality in OpenCV's machine learning module, cv2.ml. As you may recall, this is the same module that we used for SVM-related functionality in Chapter 7, Building Custom Object Detectors. Moreover, cv2.ml_ANN_MLP and cv2.ml_SVM share a common base class called cv2.ml_StatModel. Therefore, you will find that OpenCV provides similar APIs for ANNs and SVMs.

Let's examine a dummy example as a gentle introduction to ANNs. This example will use completely meaningless data, but it will show us the basic API for training and...

Training an ANN classifier in multiple epochs

Let's create an ANN that attempts to classify animals based on three measurements: weight, length, and number of teeth. This is, of course, a mock scenario. Realistically, no one would describe an animal with just these three statistics. However, our intent is to improve our understanding of ANNs before we start applying them to image data.

Compared to the minimal example in the previous section, our animal classification mock-up will be more sophisticated in the following ways:

We will increase the number of neurons in the hidden layer.
We will use a larger training dataset. For convenience, we will generate this dataset pseudorandomly.
We will train the ANN in multiple epochs, meaning that we will train and retrain it multiple times with the same dataset each time.

The number of neurons in the hidden layer is an important...

Recognizing handwritten digits with an ANN

A handwritten digit is any of the 10 Arabic numerals (0 to 9), written manually with a pen or pencil, as opposed to being printed by a machine. The appearance of handwritten digits can vary significantly. Different people have different handwriting, and – with the possible exception of a skilled calligrapher – a person does not produce identical digits every time he or she writes. This variability means that the visual recognition of handwritten digits is a non-trivial problem for machine learning. Indeed, students and researchers in machine learning often test their skills and new algorithms by attempting to train an accurate recognizer for handwritten digits. We will approach this challenge in the following manner:

Load data from a Python-friendly version of the MNIST database. This is a widely used database containing...

Using DNNs from other frameworks in OpenCV

OpenCV can load and use DNNs that have been trained in any of the following frameworks:

Caffe (http://caffe.berkeleyvision.org/)
TensorFlow (https://www.tensorflow.org/)
Torch (http://torch.ch/)
Darknet (https://pjreddie.com/darknet/)
ONNX (https://onnx.ai/)
DLDT (https://github.com/opencv/dldt/)

The Deep Learning Deployment Toolkit (DLDT) is part of Intel's OpenVINO Toolkit (https://software.intel.com/openvino-toolkit/) for computer vision. DLDT provides tools for optimizing DNNs from other frameworks and for converting them into a common format. A collection of DLDT-compatible models is freely available in a repository called the Open Model Zoo (https://github.com/opencv/open_model_zoo/). DLDT, the Open Model Zoo, and OpenCV have some of the same people on their development teams; all three of these projects are sponsored by...

Detecting and classifying objects with third-party DNNs

For this demo, we are going to capture frames from a webcam in real time and use a DNN to detect and classify 20 kinds of objects that may be in any given frame. Yes, a single DNN can do all this in real time on a typical laptop that a programmer might use!

Before delving into the code, let's introduce the DNN that we will use. It is a Caffe version of a model called MobileNet-SSD, which uses a hybrid of a framework from Google called MobileNet and another framework called Single Shot Detector (SSD) MultiBox. The latter framework has a GitHub repository at https://github.com/weiliu89/caffe/tree/ssd/. The training technique for the Caffe version of MobileNet-SSD is provided by a project on GitHub at https://github.com/chuanqi305/MobileNet-SSD/. Copies of the following MobileNet-SSD files can be found in this book's...
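To give a sense of what an SSD-style detector hands back, here is a minimal sketch of a post-processing step. It assumes the output layout that OpenCV typically produces for SSD models: an array of shape (1, 1, N, 7), where each of the N rows holds an image ID, a class ID, a confidence score, and box corners (x0, y0, x1, y1) normalized to the range 0 to 1. The function name, the threshold, and the synthetic data are invented for this example.

```python
import numpy as np

def parse_ssd_detections(detections, frame_w, frame_h, min_confidence=0.5):
    """Convert raw SSD output into (class_id, confidence, box) tuples.

    Boxes are returned as integer pixel coordinates (x0, y0, x1, y1).
    """
    scale = np.array([frame_w, frame_h, frame_w, frame_h])
    results = []
    for det in detections[0, 0]:
        confidence = float(det[2])
        if confidence < min_confidence:
            continue  # Skip weak detections.
        class_id = int(det[1])
        box = np.round(det[3:7] * scale).astype(int)
        results.append((class_id, confidence, tuple(box.tolist())))
    return results

# Synthetic stand-in for the array a real network would return, for
# example from net.forward() after net.setInput(blob).
fake_output = np.array([[[[0, 15, 0.9, 0.1, 0.2, 0.5, 0.6],
                          [0, 7, 0.3, 0.0, 0.0, 1.0, 1.0]]]], np.float32)

for class_id, confidence, box in parse_ssd_detections(fake_output, 640, 480):
    print(class_id, round(confidence, 2), box)
```

Only the first synthetic row passes the confidence threshold; its normalized corners map to pixel coordinates in a 640x480 frame.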

Detecting and classifying faces with third-party DNNs

For this demonstration, we are going to use one DNN to detect faces and two other DNNs to classify the age and gender of each detected face. Specifically, we will use pre-trained Caffe models that are stored in the chapter10/faces_data folder of this book's GitHub repository.

Here is an inventory of the files in this folder, and of the files' origins:

detection/res10_300x300_ssd_iter_140000.caffemodel: This is the DNN for face detection. The OpenCV team has provided this file at https://github.com/opencv/opencv_3rdparty/blob/dnn_samples_face_detector_20170830/res10_300x300_ssd_iter_140000.caffemodel. This Caffe model was trained with the SSD framework (https://github.com/weiliu89/caffe/tree/ssd/). Thus, its topology is similar to the MobileNet-SSD model that we used in the previous section...
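The way the three networks fit together can be sketched without the models themselves: the detector yields a box, the box is cropped out of the frame, and each classifier's output scores are reduced to a single label. The helper names, the label list, and the synthetic data below are hypothetical and serve only to illustrate this data flow.

```python
import numpy as np

def crop_face(frame, box):
    """Crop a detected face from a frame, clamping the box to the frame."""
    h, w = frame.shape[:2]
    x0, y0, x1, y1 = box
    x0, y0 = max(0, x0), max(0, y0)
    x1, y1 = min(w, x1), min(h, y1)
    return frame[y0:y1, x0:x1]

def best_label(scores, labels):
    """Return the label with the highest score, along with that score."""
    scores = np.asarray(scores).ravel()
    i = int(scores.argmax())
    return labels[i], float(scores[i])

# Synthetic stand-ins for a webcam frame, a detected face box that
# partly leaves the frame, and a classifier's output scores.
frame = np.zeros((100, 200, 3), np.uint8)
face = crop_face(frame, (-10, 20, 50, 120))
print(face.shape)

labels = ['label_a', 'label_b', 'label_c']  # placeholder class names
print(best_label(np.array([[0.2, 0.7, 0.1]]), labels))
```

In a real pipeline, the cropped face would be converted to a blob and passed to the age classifier and the gender classifier in turn, with `best_label` applied to each network's output.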

Summary

This chapter scratched the surface of the vast and fascinating world of ANNs. We learned about the structure of ANNs, and how to design a network topology based on application requirements. Then, we focused on OpenCV's implementation of MLP ANNs, as well as on OpenCV's support for diverse DNNs that have been trained in other frameworks.

We applied neural networks to real-world problems: notably, handwritten digit recognition; object detection and classification; and a combination of face detection, age classification, and gender classification in real time. We saw that even in these introductory demos, neural networks show a lot of promise in terms of versatility, accuracy, and speed. Hopefully, this encourages you to try out pre-trained models from various authors, and to learn to train advanced models of your own in various frameworks.

With this thought,...


Authors (2)

Joseph Howse

Joseph Howse lives in a Canadian fishing village, where he chats with his cats, crafts his books, and nurtures an orchard of hardy fruit trees. He is President of Nummist Media Corporation, which exists to support his books and to provide mentoring and consulting services, with a specialty in computer vision. On average, in 2015-2022, Joseph has written 1.4 new books or new editions per year for Packt. He also writes fiction, including an upcoming novel about the lives of a group of young people in the last days of the Soviet Union.

Joe Minichino

Joe Minichino is an R&D labs engineer at Teamwork. He is a passionate programmer who is immensely curious about programming languages and technologies and constantly experimenting with them. Born and raised in Varese, Lombardy, Italy, and coming from a humanistic background in philosophy (at Milan's Università Statale), Joe has lived in Cork, Ireland, since 2004. There, he became a computer science graduate at the Cork Institute of Technology.
