
You're reading from  Hands-On Neural Networks with TensorFlow 2.0

Product type: Book
Published in: Sep 2019
Reading level: Expert
Publisher: Packt
ISBN-13: 9781789615555
Edition: 1st Edition
Author: Paolo Galeone

Paolo Galeone is a computer engineer with strong practical experience. After getting his MSc degree, he joined the Computer Vision Laboratory at the University of Bologna, Italy, as a research fellow, where he improved his computer vision and machine learning knowledge working on a broad range of research topics. Currently, he leads the Computer Vision and Machine Learning laboratory at ZURU Tech, Italy. In 2019, Google recognized his expertise by awarding him the title of Google Developer Expert (GDE) in Machine Learning. As a GDE, he shares his passion for machine learning and the TensorFlow framework by blogging, speaking at conferences, contributing to open-source projects, and answering questions on Stack Overflow.

Neural Networks and Deep Learning

Neural networks are the main machine learning models that we will be looking at in this book, and their applications are countless. They range from computer vision (where an object must be localized in an image), to finance (where neural networks are applied to detect fraud), through trading, and even to the art field, where neural networks are used together with the adversarial training process to create models that are able to generate new and unseen kinds of art with astonishing results.

This chapter, which is perhaps the richest in terms of theory in this whole book, shows you how to define neural networks and how to make them learn. To begin, the mathematical formula for artificial neurons will be presented, and we will highlight why a neuron must have certain features to be able to...

Neural networks

The definition of a neural network, as provided by the inventor of one of the first neurocomputers, Dr. Robert Hecht-Nielsen, in Neural Network Primer: Part I, is as follows:

"A computing system made up of a number of simple, highly interconnected processing elements, which process information by their dynamic state response to external inputs."

In practice, we can think of artificial neural networks as a computational model that is based on how the brain is believed to work. Hence, the mathematical model is inspired by biological neurons.
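Before moving to the biological analogy, the mathematical model can be made concrete with a minimal sketch of a single artificial neuron: a weighted sum of the inputs plus a bias, passed through a non-linear activation. The function and variable names below (`sigmoid`, `neuron`, the sample weights) are illustrative, not taken from the chapter.

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation: squashes any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, w, b):
    """A single artificial neuron: weighted sum of the inputs plus a
    bias term, passed through a non-linear activation function."""
    return sigmoid(np.dot(w, x) + b)

x = np.array([0.5, -1.0, 2.0])  # input signals
w = np.array([0.4, 0.3, -0.2])  # synaptic weights
b = 0.1                         # bias term
print(neuron(x, w, b))          # a value in (0, 1)
```

The non-linearity is essential: as the exercises at the end of this chapter point out, stacking purely linear neurons collapses into a single linear map.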

Biological neurons

The main computational units of the brain are known as neurons; in the human nervous system, approximately 86 billion neurons can be found...

Optimization

Operations research gives us efficient algorithms that we can use to solve optimization problems by finding the global optimum (the global minimum point), provided the problems are expressed as functions with well-defined characteristics (for instance, convex optimization requires the function to be convex).

Artificial neural networks are universal function approximators; therefore, it is not possible to make assumptions about the shape of the function the neural network is approximating. Moreover, the most common optimization methods exploit geometric considerations, but we know from Chapter 1, What is Machine Learning?, that geometry works in an unusual way when dimensionality is high due to the curse of dimensionality.

For these reasons, it is not possible to use operations research methods that are capable of finding the global optimum of an optimization (minimization...
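In practice, neural networks are trained with iterative, gradient-based methods that settle for a good local minimum rather than guaranteeing the global one. A minimal gradient-descent sketch on a toy one-dimensional loss (the function, starting point, and learning rate are all illustrative):

```python
def f(x):
    """Toy loss: a convex parabola with its minimum at x = 3."""
    return (x - 3.0) ** 2

def grad_f(x):
    """Analytical gradient of f with respect to x."""
    return 2.0 * (x - 3.0)

x = 0.0              # initial parameter value
learning_rate = 0.1
for step in range(100):
    # Update rule: move the parameter against the gradient direction.
    x -= learning_rate * grad_f(x)

print(x)  # converges towards the minimum at 3.0
```

On a real neural network the same update rule is applied to every parameter, with the gradients computed by backpropagation instead of by hand.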

Convolutional neural networks

Convolutional Neural Networks (CNNs) are the fundamental building blocks of modern computer vision, speech recognition, and even natural language processing applications. In this section, we are going to describe the convolution operator, how it is used in the signal analysis domain, and how convolution is used in machine learning.

The convolution operator

Signal theory gives us all the tools we need to properly understand the convolution operation: why it is so widely used in many different domains and why CNNs are so powerful. The convolution operation is used to study the response of certain physical systems when a signal is applied to their input. Different input stimuli can make a system...
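The discrete, one-dimensional case can be sketched directly with NumPy; the signal and the averaging kernel below are illustrative choices, not from the chapter:

```python
import numpy as np

# A simple input signal and a small kernel (filter).
signal = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
kernel = np.array([1 / 3, 1 / 3, 1 / 3])  # moving-average filter

# Discrete convolution: slide the (flipped) kernel over the signal and
# sum the element-wise products at each valid position.
response = np.convolve(signal, kernel, mode="valid")
print(response)  # each output is the mean of three consecutive samples
```

CNNs apply the same idea in two dimensions, with the important difference that the kernel values are not fixed in advance but learned during training.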

Regularization

Regularization is a way to deal with the problem of overfitting: the goal of regularization is to modify the learning algorithm, or the model itself, so that the model performs well not only on the training data, but also on new inputs.

One of the most widely used solutions to the overfitting problem, and probably one of the simplest to understand and analyze, is known as dropout.

Dropout

The idea of dropout is to train an ensemble of neural networks and average the results, instead of training only a single standard network. Dropout builds new neural networks, starting from a standard neural network, by dropping out neurons with some fixed probability (the dropout rate).

When a neuron is dropped out, its output is set...
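A minimal sketch of (inverted) dropout in NumPy follows; the drop probability and the layer activations are illustrative, and in practice frameworks such as tf.keras expose this as a ready-made layer (`tf.keras.layers.Dropout`), so you rarely write it by hand:

```python
import numpy as np

def dropout(activations, drop_prob, rng):
    """Inverted dropout: zero each activation with probability
    drop_prob and rescale the survivors so that the expected value
    of the output stays unchanged."""
    keep_prob = 1.0 - drop_prob
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
layer_output = np.ones(10)          # pretend activations of a layer
dropped = dropout(layer_output, drop_prob=0.5, rng=rng)
print(dropped)  # roughly half the units are zeroed, the rest scaled up
```

Note that dropout is applied only at training time; at inference time the full network is used, which is what realizes the averaging over the ensemble.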

Summary

This chapter is probably the most theory-intensive of this whole book; however, you need at least an intuitive idea of the building blocks of neural networks and of the various algorithms used in machine learning in order to start developing a meaningful understanding of what's going on.

We have looked at what a neural network is, what it means to train it, and how to perform a parameter update with some of the most common update strategies. You should now have a basic understanding of how the chain rule can be applied in order to compute the gradient of a function efficiently.
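As a quick reminder of why the chain rule matters here, a gradient computed by composing derivatives can be checked against a numerical estimate; the composite function below is an illustrative example, not one from the chapter:

```python
import numpy as np

def f(x):
    """Composite function f(x) = sin(x**2)."""
    return np.sin(x ** 2)

def df(x):
    """Gradient obtained with the chain rule:
    d/dx sin(x**2) = cos(x**2) * 2x."""
    return np.cos(x ** 2) * 2 * x

x = 1.3
# Numerical gradient (central differences) as a sanity check.
eps = 1e-6
numerical = (f(x + eps) - f(x - eps)) / (2 * eps)
print(df(x), numerical)  # the two estimates agree closely
```

Backpropagation applies exactly this composition, layer by layer, to obtain the gradient of the loss with respect to every parameter in a single backward pass.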

We haven't explicitly talked about deep learning, but in practice, that is what we did; keep in mind that stacking layers of neural networks is like stacking different classifiers that combine their expressive power. We indicated this with the term deep...

Exercises

This chapter was filled with various theoretical concepts to understand, so, just like the previous chapter, don't skip the exercises:

  1. What are the similarities between artificial and biological neurons?
  2. Does the neuron's topology change the neural network's behavior?
  3. Why do neurons require a non-linear activation function?
  5. If the activation function is linear, a multi-layer neural network is the same as a single-layer neural network. Why?
  5. How is an error in input data treated by a neural network?
  6. Write the mathematical formulation of a generic neuron.
  7. Write the mathematical formulation of a fully connected layer.
  8. Why can a multi-layer configuration solve problems with non-linearly separable solutions?
  9. Draw the graphs of the sigmoid, tanh, and ReLU activation functions.
  10. Is it always required to format training set labels into a one-hot encoded representation...