The Deep Learning Workshop

By Mirza Rahim Baig, Thomas V. Joseph, Nipun Sadvilkar, Mohan Kumar Silaparasetty, and Anthony So
About this book
Are you fascinated by how deep learning powers intelligent applications such as self-driving cars, virtual assistants, facial recognition devices, and chatbots to process data and solve complex problems? Whether you are familiar with machine learning or are new to this domain, The Deep Learning Workshop will make it easy for you to understand deep learning with the help of interesting examples and exercises throughout. The book starts by highlighting the relationship between deep learning, machine learning, and artificial intelligence and helps you get comfortable with the TensorFlow 2.0 programming structure using hands-on exercises. You’ll understand neural networks, the structure of a perceptron, and how to use TensorFlow to create and train models. The book will then let you explore the fundamentals of computer vision by performing image recognition exercises with convolutional neural networks (CNNs) using Keras. As you advance, you’ll be able to make your model more powerful by implementing text embedding and sequencing the data using popular deep learning solutions. Finally, you’ll get to grips with bidirectional recurrent neural networks (RNNs) and build generative adversarial networks (GANs) for image synthesis. By the end of this deep learning book, you’ll have learned the skills essential for building deep learning models with TensorFlow and Keras.
Publication date:
July 2020


2. Neural Networks


This chapter starts with an introduction to biological neurons; we see how an artificial neural network is inspired by biological neural networks. We will examine the structure and inner workings of a simple single-layer neuron called a perceptron and learn how to implement it in TensorFlow. We will move on to building multilayer neural networks to solve more complex multiclass classification tasks and discuss the practical considerations of designing a neural network. As we build deep neural networks, we will move on to Keras to build modular and easy-to-customize neural network models in Python. By the end of this chapter, you'll be adept at building neural networks to solve complex problems.



In the previous chapter, we learned how to implement basic mathematical concepts such as quadratic equations, linear algebra, and matrix multiplication in TensorFlow. Now that we have learned the basics, let's dive into Artificial Neural Networks (ANNs), which are central to artificial intelligence and deep learning.

Deep learning is a subset of machine learning. In supervised learning, we often use traditional machine learning techniques, such as support vector machines or tree-based models, where features are explicitly engineered by humans. However, in deep learning, the model explores and identifies the important features of a labeled dataset without human intervention. ANNs, inspired by biological neurons, have a layered representation, which helps them learn labels incrementally—from the minute details to the complex ones. Consider the example of image recognition: in a given image, an ANN would just as easily identify basic details such as light and...


Neural Networks and the Structure of Perceptrons

A neuron is a basic building block of the human nervous system, which relays electric signals across the body. The human brain consists of billions of interconnected biological neurons, and they are constantly communicating with each other by sending minute electrical binary signals by turning themselves on or off. The general meaning of a neural network is a network of interconnected neurons. In the current context, we are referring to ANNs, which are actually modeled on a biological neural network. The term artificial intelligence is derived from the fact that natural intelligence exists in the human brain (or any brain for that matter), and we humans are trying to simulate this natural intelligence artificially. Though ANNs are inspired by biological neurons, some of the advanced neural network architectures, such as CNNs and RNNs, do not actually mimic the behavior of a biological neuron. However, for ease of understanding, we will...


Training a Perceptron

To train a perceptron, we need the following components:

  • Data representation
  • Layers
  • Neural network representation
  • Loss function
  • Optimizer
  • Training loop

In the previous section, we covered most of the preceding components: the data representation of the input data and the true labels in TensorFlow. For layers, we have the linear layer and the activation functions, which we saw in the form of the net input function and the sigmoid function respectively. For the neural network representation, we made a function called perceptron(), which uses a linear layer and a sigmoid layer to perform predictions. What we did in the previous section using input data and initial weights and biases is called forward propagation. The actual neural network training involves two stages: forward propagation and backward propagation. We will explore them in detail in the next few steps. Let's look at the training process at a high level:
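Putting these components together, a minimal end-to-end training loop for a single perceptron can be sketched in plain NumPy. This is a hedged sketch, not the book's code: the toy AND data, learning rate, and epoch count are our own illustrative choices, and the book's version uses TensorFlow operations instead.

```python
import numpy as np

def sigmoid(z):
    """Activation function: squashes the net input into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: the AND function as a binary classification task.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([0., 0., 0., 1.])

w, b, lr = np.zeros(2), 0.0, 0.5  # weights, bias, learning rate

for epoch in range(1000):
    # Forward propagation: net input, then sigmoid activation.
    y_hat = sigmoid(X @ w + b)
    # Backward propagation: gradient of the binary cross-entropy
    # loss with respect to the weights and bias.
    error = y_hat - y
    w -= lr * (X.T @ error) / len(y)
    b -= lr * error.mean()

preds = np.round(sigmoid(X @ w + b))
print(preds)  # [0. 0. 0. 1.]
```

Each epoch runs both stages: the forward pass produces predictions, and the backward pass nudges the weights and bias in the direction that reduces the loss.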


Keras as a High-Level API

In TensorFlow 1.0, there were several APIs, such as Estimator, Contrib, and layers. In TensorFlow 2.0, Keras is very tightly integrated with TensorFlow, and it provides a high-level API that is user-friendly, modular, composable, and easy to extend in order to build and train deep learning models. This also makes developing code for neural networks much easier. Let's see how it works.

Exercise 2.05: Binary Classification Using Keras

In this exercise, we will implement a very simple binary classifier with a single neuron using the Keras API. We will use the same data.csv file that we used in Exercise 2.02, Perceptron as a Binary Classifier:


The dataset can be downloaded from the following GitHub link:

  1. Import the required libraries:
    import tensorflow as tf
    import pandas as pd
    import matplotlib.pyplot as plt
    %matplotlib inline
    # Import Keras libraries
    from tensorflow.keras.models import...
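The snippet above is truncated mid-import. As a hedged sketch of where the exercise is headed, a single-neuron binary classifier in Keras might look like the following; the synthetic data is our own stand-in (the exercise itself uses data.csv), and the epoch count is an illustrative choice:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Synthetic stand-in for data.csv: two features, binary labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2)).astype("float32")
y = (X[:, 0] + X[:, 1] > 0).astype("float32")

# A single sigmoid neuron plays the role of the perceptron
# from the earlier sections.
model = Sequential([
    tf.keras.Input(shape=(2,)),
    Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(X, y, epochs=10, verbose=0)

print(model.predict(X[:5], verbose=0).shape)  # (5, 1)
```

Note how little code the model definition takes: one Dense layer with a sigmoid activation, compiled with a loss and an optimizer, replaces the hand-written net input and activation functions.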

Exploring the Optimizers and Hyperparameters of Neural Networks

Training a neural network to get good predictions requires tweaking a lot of hyperparameters such as optimizers, activation functions, the number of hidden layers, the number of neurons in each layer, the number of epochs, and the learning rate. Let's go through each of them one by one and discuss them in detail.
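Each hyperparameter named above corresponds to a concrete choice in code. As a hedged sketch (the layer sizes, learning rate, and optimizer here are illustrative choices, not values prescribed by the book), a Keras model for a three-class problem might expose them like this:

```python
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

model = Sequential([
    tf.keras.Input(shape=(4,)),
    Dense(8, activation="relu"),     # neurons per hidden layer
    Dense(8, activation="relu"),     # number of hidden layers
    Dense(3, activation="softmax"),  # three-class output
])
model.compile(
    # The optimizer and the learning rate are set together here.
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
# The number of epochs is chosen at training time:
# model.fit(X_train, y_train, epochs=50)
print(model.count_params())  # 139
```

Changing any of these values (a different optimizer, wider layers, more epochs) changes only one line, which is what makes systematic hyperparameter tuning practical.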

Gradient Descent Optimizers

In an earlier section titled Perceptron Training Process in TensorFlow, we briefly touched upon the gradient descent optimizer without going into the details of how it works. This is a good time to explore the gradient descent optimizer in a little more detail. We will provide an intuitive explanation without going into the mathematical details.

The gradient descent optimizer's function is to minimize the loss or error. To understand how gradient descent works, you can think of this analogy: imagine a person at the top of a hill who wants to reach the bottom...
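The downhill analogy maps directly onto the update rule: evaluate the gradient of the loss at the current position and step in the opposite direction, scaled by the learning rate. A minimal one-dimensional sketch, where the loss function and learning rate are our own illustrative choices:

```python
# Loss: (w - 3)^2, whose minimum sits at w = 3.
def gradient(w):
    return 2 * (w - 3)      # derivative of (w - 3)^2

w = 0.0                     # starting point: "top of the hill"
learning_rate = 0.1

for step in range(100):
    w -= learning_rate * gradient(w)   # step downhill

print(round(w, 4))  # 3.0
```

Too large a learning rate overshoots the bottom and can diverge; too small a rate takes many steps to arrive, which is why the learning rate itself is a key hyperparameter.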


Activity 2.01: Build a Multilayer Neural Network to Classify Sonar Signals

In this activity, we will use the Sonar dataset, which contains patterns obtained by bouncing sonar signals off a metal cylinder at various angles and under various conditions. You will build a neural network-based classifier to distinguish between sonar signals bounced off a metal cylinder (the Mine class) and those bounced off a roughly cylindrical rock (the Rock class). We recommend using the Keras API to make your code more readable and modular, which will allow you to experiment with different parameters easily:


You can download the Sonar dataset from this link.

  1. The first step is to understand the data so that you can figure out whether this is a binary classification problem or a multiclass classification problem.
  2. Once you understand the data and the type of classification that...
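As a starting point for this activity, a multilayer classifier could be structured as follows. The hidden layer sizes are illustrative choices of our own; the 60-feature input matches the UCI Sonar dataset, which provides 60 numeric sonar energy readings per sample:

```python
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

# Two classes (Mine vs. Rock) make this binary classification,
# so a single sigmoid output neuron suffices.
model = Sequential([
    tf.keras.Input(shape=(60,)),
    Dense(32, activation="relu"),   # hidden layer sizes are illustrative
    Dense(16, activation="relu"),
    Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])
# After loading the CSV and encoding Mine/Rock as 1/0:
# model.fit(X_train, y_train, epochs=50, validation_split=0.2)
print(model.output_shape)  # (None, 1)
```

From here, experimenting with the number of layers, neurons per layer, and epochs, as the activity suggests, only requires editing the list of Dense layers and the fit() call.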


In this chapter, we started off by looking at biological neurons and then moved on to artificial neurons. We saw how neural networks work and took a practical approach to building single-layer and multilayer neural networks to solve supervised learning tasks. We looked at how a perceptron works, which is a single unit of a neural network, all the way to a deep neural network capable of performing multiclass classification. We saw how Keras makes it very easy to create deep neural networks with a minimal amount of code. Lastly, we looked at practical considerations to take into account when building a successful neural network, which involved important concepts such as gradient descent optimizers, overfitting, and dropout.

In the next chapter, we will go to the next level and build a more complicated neural network called a CNN, which is widely used in image recognition.

About the Authors
  • Mirza Rahim Baig

Mirza Rahim Baig is an avid problem solver who uses deep learning and artificial intelligence to solve complex business problems. He has more than a decade of experience in creating value from data, harnessing the power of the latest in machine learning and AI with proficiency in using unstructured and structured data across areas like marketing, customer experience, catalog, supply chain, and other eCommerce sub-domains. Rahim is also a teacher, designing, creating, and teaching data science courses for various learning platforms. He loves making the complex easy to understand. He is also the co-author of The Deep Learning Workshop, a hands-on guide to start your deep learning journey and build your own next-generation deep learning models.

  • Thomas V. Joseph

    Thomas V. Joseph is a data science practitioner, researcher, trainer, mentor, and writer with more than 19 years of experience. He has extensive experience in solving business problems using machine learning toolsets across multiple industry segments.

  • Nipun Sadvilkar

Nipun Sadvilkar is a senior data scientist at a US healthcare company, leading a team of data scientists and subject matter experts to design and build a clinical NLP engine that revamps medical coding workflows, enhances coder efficiency, and accelerates the revenue cycle. He has more than 3 years of experience in building NLP solutions and web-based data science platforms in the areas of healthcare, finance, media, and psychology. His interests lie at the intersection of machine learning and software engineering, with a fair understanding of the business domain. He is a member of the regional and national Python communities. He is the author of pySBD, an open-source Python NLP library for sentence segmentation, which has been recognized by the ExplosionAI (spaCy) and AllenAI (scispaCy) organizations.

  • Mohan Kumar Silaparasetty

Mohan Kumar Silaparasetty is a seasoned deep learning and AI professional. He is a graduate of IIT Kharagpur with more than 25 years of industry experience in a variety of roles. After a successful corporate career, Mohan embarked on his entrepreneurial journey and is the co-founder and CEO of Trendwise Analytics, a company that provides consulting and training in AI and deep learning. He is also the organizer of the Bangalore Artificial Intelligence Meetup group, which has over 3,500 members.

  • Anthony So

    Anthony So is a renowned leader in data science. He has extensive experience in solving complex business problems using advanced analytics and AI in different industries including financial services, media, and telecommunications. He is currently the chief data officer of one of the most innovative fintech start-ups. He is also the author of several best-selling books on data science, machine learning, and deep learning. He has won multiple prizes at several hackathon competitions, such as Unearthed, GovHack, and Pepper Money. Anthony holds two master's degrees, one in computer science and the other in data science and innovation.
