Reader small image

You're reading from  Machine Learning with PyTorch and Scikit-Learn

Product typeBook
Published inFeb 2022
PublisherPackt
ISBN-139781801819312
Edition1st Edition
Right arrow
Authors (3):
Sebastian Raschka
Sebastian Raschka
author image
Sebastian Raschka

Sebastian Raschka is an Assistant Professor of Statistics at the University of Wisconsin-Madison focusing on machine learning and deep learning research. As Lead AI Educator at Grid AI, Sebastian plans to continue following his passion for helping people get into machine learning and artificial intelligence.
Read more about Sebastian Raschka

Yuxi (Hayden) Liu
Yuxi (Hayden) Liu
author image
Yuxi (Hayden) Liu

Yuxi (Hayden) Liu was a Machine Learning Software Engineer at Google. With a wealth of experience from his tenure as a machine learning scientist, he has applied his expertise across data-driven domains and applied his ML expertise in computational advertising, cybersecurity, and information retrieval. He is the author of a series of influential machine learning books and an education enthusiast. His debut book, also the first edition of Python Machine Learning by Example, ranked the #1 bestseller in Amazon and has been translated into many different languages.
Read more about Yuxi (Hayden) Liu

Vahid Mirjalili
Vahid Mirjalili
author image
Vahid Mirjalili

Vahid Mirjalili is a deep learning researcher focusing on CV applications. Vahid received a Ph.D. degree in both Mechanical Engineering and Computer Science from Michigan State University.
Read more about Vahid Mirjalili

View More author details
Right arrow

Implementing a Multilayer Artificial Neural Network from Scratch

As you may know, deep learning is getting a lot of attention from the press and is, without doubt, the hottest topic in the machine learning field. Deep learning can be understood as a subfield of machine learning that is concerned with training artificial neural networks (NNs) with many layers efficiently. In this chapter, you will learn the basic concepts of artificial NNs so that you are well equipped for the following chapters, which will introduce advanced Python-based deep learning libraries and deep neural network (DNN) architectures that are particularly well suited for image and text analyses.

The topics that we will cover in this chapter are as follows:

  • Gaining a conceptual understanding of multilayer NNs
  • Implementing the fundamental backpropagation algorithm for NN training from scratch
  • Training a basic multilayer NN for image classification

Modeling complex functions with artificial neural networks

At the beginning of this book, we started our journey through machine learning algorithms with artificial neurons in Chapter 2, Training Simple Machine Learning Algorithms for Classification. Artificial neurons represent the building blocks of the multilayer artificial NNs that we will discuss in this chapter.

The basic concept behind artificial NNs was built upon hypotheses and models of how the human brain works to solve complex problem tasks. Although artificial NNs have gained a lot of popularity in recent years, early studies of NNs go back to the 1940s, when Warren McCulloch and Walter Pitts first described how neurons could work. (A logical calculus of the ideas immanent in nervous activity, by W. S. McCulloch and W. Pitts, The Bulletin of Mathematical Biophysics, 5(4):115–133, 1943.)

However, in the decades that followed the first implementation of the McCulloch-Pitts neuron model—Rosenblatt&...

Classifying handwritten digits

In the previous section, we covered a lot of the theory around NNs, which can be a little bit overwhelming if you are new to this topic. Before we continue with the discussion of the algorithm for learning the weights of the MLP model, backpropagation, let’s take a short break from the theory and see an NN in action.

Additional resources on backpropagation

The NN theory can be quite complex; thus, we want to provide readers with additional resources that cover some of the topics we discuss in this chapter in more detail or from a different perspective:

  • Chapter 6, Deep Feedforward Networks, Deep Learning, by I. Goodfellow, Y. Bengio, and A. Courville, MIT Press, 2016 (manuscripts freely accessible at http://www.deeplearningbook.org).
  • Pattern Recognition and Machine Learning, by C. M. Bishop, Springer New York, 2006.
  • Lecture video slides from Sebastian Raschka’s deep learning course:

    https:/...

Training an artificial neural network

Now that we have seen an NN in action and have gained a basic understanding of how it works by looking over the code, let’s dig a little bit deeper into some of the concepts, such as the loss computation and the backpropagation algorithm that we implemented to learn the model parameters.

Computing the loss function

As mentioned previously, we used an MSE loss (as in Adaline) to train the multilayer NN as it makes the derivation of the gradients a bit easier to follow. In later chapters, we will discuss other loss functions, such as the multi-category cross-entropy loss (a generalization of the binary logistic regression loss), which is a more common choice for training NN classifiers.

In the previous section, we implemented an MLP for multiclass classification that returns an output vector of t elements that we need to compare to the t×1 dimensional target vector in the one-hot encoding representation. If we predict the...

About convergence in neural networks

You might be wondering why we did not use regular gradient descent but instead used mini-batch learning to train our NN for the handwritten digit classification earlier. You may recall our discussion on SGD that we used to implement online learning. In online learning, we compute the gradient based on a single training example (k = 1) at a time to perform the weight update. Although this is a stochastic approach, it often leads to very accurate solutions with a much faster convergence than regular gradient descent. Mini-batch learning is a special form of SGD where we compute the gradient based on a subset k of the n training examples with 1 < k < n. Mini-batch learning has an advantage over online learning in that we can make use of our vectorized implementations to improve computational efficiency. However, we can update the weights much faster than in regular gradient descent. Intuitively, you can think of...

A few last words about the neural network implementation

You may be wondering why we went through all of this theory just to implement a simple multilayer artificial network that can classify handwritten digits instead of using an open source Python machine learning library. In fact, we will introduce more complex NN models in the next chapters, which we will train using the open source PyTorch library (https://pytorch.org).

Although the from-scratch implementation in this chapter seems a bit tedious at first, it was a good exercise for understanding the basics behind backpropagation and NN training. A basic understanding of algorithms is crucial for applying machine learning techniques appropriately and successfully.

Now that you have learned how feedforward NNs work, we are ready to explore more sophisticated DNNs using PyTorch, which allows us to construct NNs more efficiently, as we will see in Chapter 12, Parallelizing Neural Network Training with PyTorch.

PyTorch...

Summary

In this chapter, you have learned the basic concepts behind multilayer artificial NNs, which are currently the hottest topic in machine learning research. In Chapter 2, Training Simple Machine Learning Algorithms for Classification, we started our journey with simple single-layer NN structures and now we have connected multiple neurons to a powerful NN architecture to solve complex problems such as handwritten digit recognition. We demystified the popular backpropagation algorithm, which is one of the building blocks of many NN models that are used in deep learning. After learning about the backpropagation algorithm in this chapter, we are well equipped for exploring more complex DNN architectures. In the remaining chapters, we will cover more advanced deep learning concepts and PyTorch, an open source library that allows us to implement and train multilayer NNs more efficiently.

Join our book’s Discord space

Join our Discord community to meet like-minded people...

lock icon
The rest of the chapter is locked
You have been reading a chapter from
Machine Learning with PyTorch and Scikit-Learn
Published in: Feb 2022Publisher: PacktISBN-13: 9781801819312
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
undefined
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at AU $19.99/month. Cancel anytime

Authors (3)

author image
Sebastian Raschka

Sebastian Raschka is an Assistant Professor of Statistics at the University of Wisconsin-Madison focusing on machine learning and deep learning research. As Lead AI Educator at Grid AI, Sebastian plans to continue following his passion for helping people get into machine learning and artificial intelligence.
Read more about Sebastian Raschka

author image
Yuxi (Hayden) Liu

Yuxi (Hayden) Liu was a Machine Learning Software Engineer at Google. With a wealth of experience from his tenure as a machine learning scientist, he has applied his expertise across data-driven domains and applied his ML expertise in computational advertising, cybersecurity, and information retrieval. He is the author of a series of influential machine learning books and an education enthusiast. His debut book, also the first edition of Python Machine Learning by Example, ranked the #1 bestseller in Amazon and has been translated into many different languages.
Read more about Yuxi (Hayden) Liu

author image
Vahid Mirjalili

Vahid Mirjalili is a deep learning researcher focusing on CV applications. Vahid received a Ph.D. degree in both Mechanical Engineering and Computer Science from Michigan State University.
Read more about Vahid Mirjalili