Training a Single Neuron

After revising the concepts around learning from data, we will now pay close attention to an algorithm that trains one of the most fundamental neural-based models: the perceptron. We will look at the steps required for the algorithm to function and at its stopping conditions. This chapter presents the perceptron model as the first model of a neuron, one that aims to learn from data in a simple manner. The perceptron model is key to understanding basic and advanced neural models that learn from data. In this chapter, we will also cover the problems and considerations associated with non-linearly separable data.

Upon completion of the chapter, you should feel comfortable discussing the perceptron model and applying its learning algorithm. You will be able to implement the algorithm over both linearly and non-linearly separable data.

Specifically, this chapter covers the following topics:

  • The perceptron model
  • The perceptron learning algorithm
  • A perceptron over non-linearly separable data

The perceptron model

Back in Chapter 1, Introduction to Machine Learning, we briefly introduced the basic model of a neuron and the perceptron learning algorithm (PLA). In this chapter, we will revisit and expand on that concept and show how it can be coded in Python. We will begin with the basic definition.

The visual concept

The perceptron is an information processing unit inspired by the human neuron, originally conceived by F. Rosenblatt and depicted in Figure 5.1 (Rosenblatt, F. (1958)). In the model, the input is represented by the vector $\mathbf{x} = [x_1, x_2, \ldots, x_n]^T$, the activation of the neuron is given by the function $z(\cdot)$, and its output is $y$. The parameters of the neuron are the weights $\mathbf{w} = [w_1, w_2, \ldots, w_n]^T$ and the bias $b$:

Figure 5.1 – The basic model of a perceptron

The trainable parameters of a perceptron are $\mathbf{w}$ and $b$, and they are unknown. Thus, we can use input training data to determine these parameters using the PLA. From Figure 5.1, $x_1$ multiplies $w_1$, then $x_2$ multiplies $w_2$, and $b$ is multiplied by 1; all these products are added and then passed through the activation function $z(\cdot)$, so that the output is $y = z(w_1 x_1 + w_2 x_2 + b)$.
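As a quick numeric illustration (the values here are made up for demonstration, not taken from the book), a two-input perceptron with a sign activation computes its output as follows:

import numpy as np

w = np.array([0.5, -1.0])      # example weights w1, w2 (arbitrary values)
b = 0.2                        # example bias (arbitrary value)
x = np.array([1.0, 2.0])       # example input x1, x2
y = np.sign(np.dot(w, x) + b)  # sign(0.5*1 - 1.0*2 + 0.2) = sign(-1.3)
print(y)                       # prints -1.0, that is, the negative class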

The perceptron learning algorithm

The perceptron learning algorithm (PLA) is the following:

Input: A binary class dataset $\mathcal{D} = \{(\mathbf{x}_i, y_i)\}_{i=1}^{N}$, with labels $y_i \in \{-1, +1\}$

  • Initialize $\mathbf{w}$ to zeros, and the iteration counter $t = 0$
  • While there are any incorrectly classified examples:
      • Pick an incorrectly classified example, call it $\mathbf{x}^*$, whose true label is $y^*$
      • Update $\mathbf{w}$ as follows: $\mathbf{w}_{t+1} = \mathbf{w}_t + y^{*}\mathbf{x}^{*}$
      • Increase the iteration counter, $t \leftarrow t + 1$, and repeat

Return: $\mathbf{w}$

Now, let's see how this takes form in Python.

PLA in Python

Here is an implementation in Python that we will discuss part by part; some of it has already been covered in earlier chapters:

import random
import numpy as np
from sklearn.datasets import make_classification

N = 100  # number of samples to generate
random.seed(a=7)  # fix the seed for reproducibility

X, y = make_classification(n_samples=N, n_features=2, n_classes=2,
                           n_informative=2, n_redundant=0, n_repeated=0,
                           n_clusters_per_class=1, class_sep=1.2,
                           random_state=5)

y[y==0] = -1  # relabel the classes as {-1, +1}

X_train = np.append(np.ones((N,1)), X, 1)  # add a column of ones for the bias

# initialize the weights to zeros
w = np.zeros(X_train.shape[1])
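To complete the picture, here is a minimal sketch of the PLA update loop consistent with the algorithm above; the variable names and the random choice among misclassified examples are our own assumptions, not necessarily the book's exact listing, and the loop assumes the data is linearly separable (otherwise it would not terminate):

it = 0
while True:
    predictions = np.sign(X_train.dot(w))          # current predictions
    misclassified = np.where(predictions != y)[0]  # indices of errors
    if len(misclassified) == 0:                    # converged: no errors left
        break
    i = np.random.choice(misclassified)            # pick one at random
    w = w + y[i] * X_train[i]                      # PLA update: w <- w + y*x
    it += 1
print('PLA converged after', it, 'updates')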

A perceptron over non-linearly separable data

As we have discussed before, a perceptron will find a solution in finite time if the data is linearly separable. However, how many iterations it takes to find a solution depends on how close the groups are to each other in the feature space.

Convergence is when the learning algorithm finds a solution or reaches a steady state that is acceptable to the designer of the learning model.

The following paragraphs will deal with convergence on different types of data: linearly separable and non-linearly separable.

Convergence on linearly separable data

For the particular dataset that we have been studying in this chapter, the separation between the two groups of data is a parameter that can be varied (with real data, this separation is usually not under our control). The parameter is class_sep, and it takes a real number; for example:

X, y = make_classification(..., class_sep=2.0, ...)

This allows us to study how many iterations it takes, on average, for the perceptron algorithm to converge.
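A sketch of such an experiment is shown below. The pla_iterations helper is our own illustrative wrapper (not from the book) that runs PLA updates until convergence or an iteration cap, averaged over several random datasets per separation value:

import numpy as np
from sklearn.datasets import make_classification

def pla_iterations(X, y, max_it=10000):
    # count PLA updates until no example is misclassified (or give up)
    Xb = np.append(np.ones((len(X), 1)), X, 1)  # column of ones for the bias
    w = np.zeros(Xb.shape[1])
    for it in range(max_it):
        mis = np.where(np.sign(Xb.dot(w)) != y)[0]
        if len(mis) == 0:
            return it                    # converged
        i = np.random.choice(mis)
        w = w + y[i] * Xb[i]             # PLA update
    return max_it                        # may not converge if not separable

for sep in [0.5, 1.0, 1.5, 2.0]:
    runs = []
    for seed in range(10):               # average over 10 random datasets
        X, y = make_classification(n_samples=100, n_features=2, n_classes=2,
                                   n_informative=2, n_redundant=0,
                                   n_repeated=0, n_clusters_per_class=1,
                                   class_sep=sep, random_state=seed)
        y[y == 0] = -1
        runs.append(pla_iterations(X, y))
    print(f'class_sep={sep}: average iterations = {np.mean(runs):.1f}')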

Summary

This chapter presented an overview of the classic perceptron model. We covered the theoretical model and its implementation in Python for both linearly and non-linearly separable datasets. At this point, you should feel confident that you know enough about the perceptron that you can implement it yourself. You should be able to recognize the perceptron model in the context of a neuron. Also, you should now be able to implement a pocket algorithm and early termination strategies in a perceptron, or any other learning algorithm in general.

Since the perceptron is the most essential element that paved the way for deep neural networks, after we have covered it here, the next step is to go to Chapter 6, Training Multiple Layers of Neurons. In that chapter, you will be exposed to the challenges of deep learning using the multi-layer perceptron algorithm, such as gradient descent techniques for error minimization, and hyperparameter optimization to achieve generalization. But before...

Questions and answers

  1. What is the relationship between the separability of the data and the number of iterations of the PLA?

The number of iterations can grow without bound as the data groups get closer to one another in the feature space.

  2. Will the PLA always converge?

Not always, only for linearly separable data.

  3. Can the PLA converge on non-linearly separable data?

No. However, you can find an acceptable solution by modifying it with the pocket algorithm, for example (see the sketch after these questions).

  4. Why is the perceptron important?

Because it is one of the most fundamental learning strategies, one that helped conceive the very possibility of machine learning. Without the perceptron, it could have taken longer for the scientific community to realize the potential of computer-based automatic learning algorithms.
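As mentioned in the third answer above, the pocket algorithm (Muselli, M. (1997)) keeps the best weights found so far while running ordinary PLA updates. The following is a minimal sketch of that idea in our own illustrative code, not the book's listing:

import numpy as np

def pocket_pla(X, y, max_it=1000):
    # run PLA updates, but 'pocket' the weights with the fewest errors so far
    Xb = np.append(np.ones((len(X), 1)), X, 1)  # column of ones for the bias
    w = np.zeros(Xb.shape[1])
    best_w, best_err = w.copy(), len(y) + 1
    for _ in range(max_it):
        mis = np.where(np.sign(Xb.dot(w)) != y)[0]
        if len(mis) < best_err:                # better than pocketed weights?
            best_w, best_err = w.copy(), len(mis)
        if len(mis) == 0:
            break                              # perfectly separated
        i = np.random.choice(mis)
        w = w + y[i] * Xb[i]                   # standard PLA update
    return best_w                              # best weights seen, not the last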

References

  • Rosenblatt, F. (1958). The perceptron: a probabilistic model for information storage and organization in the brain. Psychological review, 65(6), 386.
  • Muselli, M. (1997). On convergence properties of the pocket algorithm. IEEE Transactions on Neural Networks, 8(3), 623-629.