Training Multiple Layers of Neurons

Previously, in Chapter 5, Training a Single Neuron, we explored a model involving a single neuron and the concept of the perceptron. A limitation of the perceptron model is that, at best, it can only produce linear solutions, that is, separating hyperplanes in a multi-dimensional space. However, this limitation is easily overcome by using multiple neurons and multiple layers of neurons to produce highly complex non-linear solutions for separable and non-separable problems. This chapter introduces you to the first challenges of deep learning using the Multi-Layer Perceptron (MLP): error minimization with a gradient descent technique, followed by hyperparameter optimization experiments to determine trustworthy accuracy.

The following topics will be covered in this chapter:

  • The MLP model
  • Minimizing the error
  • Finding the best hyperparameters
...

The MLP model

We have previously seen, in Chapter 5, Training a Single Neuron, that Rosenblatt's perceptron model is simple and powerful for some problems (Rosenblatt, F. 1958). However, for more complicated and highly non-linear problems, Rosenblatt gave relatively little attention to models that connected many more neurons in different architectures, including deeper models (Tappert, C. 2019).

Years later, in the 1990s, Prof. Geoffrey Hinton, recipient of the 2018 Turing Award, continued working on connecting larger numbers of neurons together, since this is more brain-like than a single neuron (Hinton, G. 1990). Most people today know this type of approach as connectionism. The main idea is to connect neurons in ways that resemble the connections in the brain. One of the first successful models was the MLP, which uses a supervised gradient descent-based learning algorithm that learns to approximate a function, y = f(x), using labeled data pairs, (x, y).
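Before discussing the figure, it may help to see what this computation looks like in code. The following is a minimal NumPy sketch, not the book's code, of a forward pass through an MLP with one hidden layer of sigmoid neurons; the layer sizes, random weights, and toy input are assumptions chosen only for illustration (they mirror the 2-3-2 Keras model shown later in this chapter):

import numpy as np

def sigmoid(z):
    # Logistic activation; each neuron squashes its weighted sum into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# Layer sizes chosen only for illustration; the weights are random, untrained
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 3)), np.zeros(3)   # input -> hidden layer
W2, b2 = rng.normal(size=(3, 2)), np.zeros(2)   # hidden -> output layer

def forward(X):
    # Every layer applies an affine transformation followed by a
    # non-linearity; stacking such layers is what lets the MLP
    # approximate non-linear functions y = f(x)
    H = sigmoid(X @ W1 + b1)
    return sigmoid(H @ W2 + b2)

X = np.array([[0.5, -1.2]])   # a single toy sample with 2 features
print(forward(X))             # untrained prediction, one score per class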

Figure 6.1 depicts an MLP with one layer of multiple neurons...

Minimizing the error

Learning from data with an MLP has been one of its major challenges since its conception. As we pointed out before, two of the major problems with neural networks were the computational tractability of deeper models and the lack of stable learning algorithms that would converge to a reasonable minimum. One of the major breakthroughs in machine learning, and what paved the way for deep learning, was the development of the learning algorithm based on backpropagation. Many scientists independently derived and applied forms of backpropagation in the 1960s; however, most of the credit has been given to Prof. G. E. Hinton and his group (Rumelhart, D. E., et al. 1986). In the next few paragraphs, we will go over this algorithm, whose sole purpose is to minimize the error caused by incorrect predictions made during training.
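Before turning to the dataset, the following is a minimal from-scratch sketch of the idea behind backpropagation with gradient descent. It uses an XOR-like toy problem, a 2-3-2 sigmoid network, and a mean squared error loss; the data, learning rate, and number of epochs are illustrative assumptions, not the book's spirals example or its implementation:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# A tiny labeled dataset (assumed, for illustration): 4 points, 2 classes
# encoded as one-hot targets; the XOR pattern is not linearly separable
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
Y = np.array([[1., 0.], [0., 1.], [0., 1.], [1., 0.]])

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 3)), np.zeros(3)
W2, b2 = rng.normal(size=(3, 2)), np.zeros(2)
lr = 1.0   # learning rate (an assumed value)

for epoch in range(5000):
    # Forward pass: keep the intermediate activations for reuse below
    H = sigmoid(X @ W1 + b1)          # hidden activations
    P = sigmoid(H @ W2 + b2)          # predictions
    loss = np.mean((P - Y) ** 2)      # mean squared error

    # Backward pass: apply the chain rule layer by layer, from the loss
    # back to the first layer's weights (hence "backpropagation")
    dP = 2 * (P - Y) / Y.size         # dLoss/dP
    dZ2 = dP * P * (1 - P)            # through the output sigmoid
    dW2, db2 = H.T @ dZ2, dZ2.sum(axis=0)
    dH = dZ2 @ W2.T
    dZ1 = dH * H * (1 - H)            # through the hidden sigmoid
    dW1, db1 = X.T @ dZ1, dZ1.sum(axis=0)

    # Gradient descent: step each parameter against its gradient
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

print(loss)   # should be small if training converged for this initialization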

To begin, we will describe the dataset, which is called spirals. This is a widely known benchmark dataset that has two classes that are separable, yet highly...
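The rest of this section is not reproduced here, but a common way to generate such a two-class spiral dataset is sketched below; the generator, the number of points, and the noise level are assumptions for illustration, not necessarily the ones used in the book:

import numpy as np

def make_spirals(n=500, noise=0.2, seed=0):
    # Two interleaved spirals, one per class, with labels in {0, 1};
    # the radius grows with the angle, so the classes wrap around each
    # other and cannot be separated by a straight line
    rng = np.random.default_rng(seed)
    theta = np.sqrt(rng.uniform(size=n)) * 3 * np.pi
    x0 = np.c_[theta * np.cos(theta), theta * np.sin(theta)]
    x1 = np.c_[theta * np.cos(theta + np.pi), theta * np.sin(theta + np.pi)]
    X = np.vstack([x0, x1]) + rng.normal(scale=noise, size=(2 * n, 2))
    y = np.hstack([np.zeros(n), np.ones(n)]).astype(int)
    return X, y

X, y = make_spirals()
print(X.shape, y.shape)   # (1000, 2) (1000,)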

Finding the best hyperparameters

There is a simpler way of coding what we implemented in the previous section: using Keras. We can rely on the fact that its backprop is implemented correctly and tuned for numerical stability, and that it offers a richer set of features and algorithms that can improve the learning process. Before we begin optimizing the hyperparameters of the MLP, we should show the equivalent implementation in Keras. The following code should reproduce the same model, almost the same loss function, and almost the same backprop methodology:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Same architecture as before: 2 inputs, a hidden layer of 3 sigmoid
# neurons, and 2 sigmoid outputs (one per class)
mlp = Sequential()
mlp.add(Dense(3, input_dim=2, activation='sigmoid'))
mlp.add(Dense(2, activation='sigmoid'))

# Mean squared error loss minimized with stochastic gradient descent
mlp.compile(loss='mean_squared_error',
            optimizer='sgd',
            metrics=['accuracy'])

# This assumes that you still have X, y from earlier
# when we called...
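The remainder of this section is not shown here, but the sketch below illustrates how training might then be invoked and how a single hyperparameter, the number of hidden neurons, could be compared. It assumes X and y from the spirals example are still in memory; the epoch count, batch size, and candidate layer sizes are illustrative assumptions rather than the book's settings:

from tensorflow.keras.utils import to_categorical

# One-hot encode the integer labels to match the two output neurons
Y = to_categorical(y, num_classes=2)

# Train the model defined above (epochs and batch size are assumed values)
mlp.fit(X, Y, epochs=100, batch_size=32, verbose=0)
loss, acc = mlp.evaluate(X, Y, verbose=0)
print(acc)   # accuracy on the training data itself

# A minimal (assumed) hyperparameter comparison: try a few hidden-layer
# sizes and keep the one with the best accuracy. In practice, a held-out
# validation split would be used instead of the training data
best = None
for hidden in (3, 8, 16):
    model = Sequential([Dense(hidden, input_dim=2, activation='sigmoid'),
                        Dense(2, activation='sigmoid')])
    model.compile(loss='mean_squared_error', optimizer='sgd',
                  metrics=['accuracy'])
    model.fit(X, Y, epochs=100, batch_size=32, verbose=0)
    _, candidate_acc = model.evaluate(X, Y, verbose=0)
    if best is None or candidate_acc > best[1]:
        best = (hidden, candidate_acc)
print(best)   # (hidden neurons, accuracy) of the best configuration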

Summary

This intermediate-introductory chapter showed the design of an MLP and the paradigms surrounding its functionality. We covered the theoretical framework behind its elements and we had a full discussion and treatment of the widely known backprop mechanism to perform gradient descent on a loss function. Understanding the backprop algorithm is key for further chapters since some models are designed specifically to overcome some potential difficulties with backprop. You should feel confident that what you have learned about backprop will serve you well in knowing what deep learning is all about. This backprop algorithm, among other things, is what makes deep learning an exciting area. Now, you should be able to understand and design your own MLP with different layers and different neurons. Furthermore, you should feel confident in changing some of its parameters, although we will cover more of this in the further reading.

Chapter 7, Autoencoders, will continue with an architecture...

Questions and answers

  1. Why is the MLP better than the perceptron model?

Its larger number of neurons and layers of neurons gives the MLP an advantage over the perceptron: it can model non-linear problems and solve much more complicated pattern recognition tasks.

  2. Why is backpropagation so important to know about?

Because it is the algorithm that allows neural networks to learn efficiently from data, and it is what makes training them feasible in the era of big data.

  3. Does the MLP always converge?

Yes and no. It does always converge to a local minimum in terms of the loss function; however, it is not guaranteed to converge to a global minimum since, in practice, most loss functions are non-convex and non-smooth.

  4. Why should we try to optimize the hyperparameters of our models?

Because anyone can train a simple neural network; however, not everyone knows what things to change to make it better. The success of your model depends heavily on you trying different things and proving to yourself (and others) that your model is the best that it can be. This is what will make you a better...

References

  • Rosenblatt, F. (1958). The perceptron: a probabilistic model for information storage and organization in the brain. Psychological Review, 65(6), 386.
  • Tappert, C. C. (2019). Who is the Father of Deep Learning? Symposium on Artificial Intelligence.
  • Hinton, G. E. (1990). Connectionist learning procedures. Machine learning. Morgan Kaufmann, 555-610.
  • Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning representations by back-propagating errors. Nature, 323(6088), 533-536.
  • Florez, O. U. (2017). One LEGO at a time: Explaining the Math of How Neural Networks Learn. Online: https://omar-florez.github.io/scratch_mlp/.
  • Amari, S. I. (1993). Backpropagation and stochastic gradient descent method. Neurocomputing, 5(4-5), 185-196.