Designing Deep Learning Architectures

In the previous chapter, we went through the entire deep learning life cycle and understood what it means to make a deep learning project successful from end to end. With that knowledge, we are now ready to dive further into the technicalities of deep learning models. In this chapter, we will examine common deep learning architectures used in the industry and understand the reasoning behind each architecture’s design. For intermediate and advanced readers, this will be a brief recap to ensure alignment on the definitions of terms. For beginners, the architectures will be presented in a way that is easy to digest so that you can get up to speed on the most useful neural architectures in the world of deep learning.

Grasping the methodologies behind a wide variety of architectures allows you to innovate custom architectures specific to your use case and, most importantly, gain the skill to choose an appropriate foundational architecture based...

Technical requirements

This chapter includes some practical implementations in the Python programming language. To complete it, you will need to have a computer with the following libraries installed:

  • pandas
  • Matplotlib
  • Seaborn
  • Scikit-learn
  • NumPy
  • Keras
  • PyTorch

The code files are available on GitHub: https://github.com/PacktPublishing/The-Deep-Learning-Architect-Handbook/tree/main/CHAPTER_2.

Exploring the foundations of neural networks using an MLP

A deep learning architecture is created when at least three perceptron layers are stacked, excluding the input layer. A perceptron is a single-layer network consisting of neuron units. Each neuron unit holds a bias variable and acts as a node; neurons interact with neurons in adjacent layers through weighted connections (edges) between them. A perceptron layer is also known as a fully connected layer or dense layer, and MLPs are also known as feedforward neural networks or fully connected neural networks.
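To make this concrete, here is a minimal sketch of such an architecture in PyTorch; the layer sizes are arbitrary and chosen purely for illustration:

    import torch.nn as nn

    # A minimal MLP sketch: the input layer is implicit in the first
    # Linear layer's in_features; layer sizes are arbitrary here.
    mlp = nn.Sequential(
        nn.Linear(3, 8),   # 3 input features -> hidden layer of 8 neurons
        nn.ReLU(),         # non-linear activation between layers
        nn.Linear(8, 8),   # second hidden (fully connected/dense) layer
        nn.ReLU(),
        nn.Linear(8, 1),   # output layer producing a single prediction
    )

Each nn.Linear here is one fully connected (dense) layer of the kind just described.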

Let’s refer back to the MLP figure from the previous chapter to get a better idea.

Figure 2.1 – Simple deep learning architecture, also called an MLP

The figure shows how three data column inputs are passed into the input layer, then propagated through the hidden layer, and finally through the output layer. Although not...

Understanding neural network gradients

The goal of machine learning for an MLP is to find the weights and biases that effectively map the inputs to the desired outputs. The weights and biases generally get initialized randomly. During training on a provided dataset, they are updated iteratively, batch by batch, to minimize the loss function; the updates use gradients computed with a method called backward propagation, also known as backpropagation. A batch is a subset of the dataset used for training or evaluation, allowing the neural network to process the data in smaller groups rather than the entire dataset at once. The loss function is also known as the error function or the cost function.
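To make one such batch update concrete, here is a minimal sketch; all the names, shapes, and gradient values are illustrative placeholders, not the book’s code:

    import numpy as np

    # One illustrative gradient-descent update on a single batch.
    # In practice, grad_w and grad_b would come from backpropagation.
    learning_rate = 0.01
    weights = np.random.randn(3, 4)   # randomly initialized weights
    bias = np.zeros(4)                # biases are often initialized to zero
    grad_w = np.ones((3, 4))          # placeholder gradients for illustration
    grad_b = np.ones(4)

    # Move the parameters a small step against the gradient to reduce the loss.
    weights -= learning_rate * grad_w
    bias -= learning_rate * grad_b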

Backpropagation is a technique to find out how sensitive the overall loss is to a change in the weights and bias of every neuron, by using the partial derivatives of the loss with respect to those weights and biases. Partial derivatives from calculus are a measure of the rate...
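As a small, hedged example of such a partial derivative, consider a single neuron y = w * x + b with a squared-error loss L = (y - t)^2. The sketch below computes dL/dw analytically and verifies it with a finite-difference approximation; all values are made up for illustration:

    import numpy as np

    # Single neuron: y = w * x + b, squared-error loss L = (y - t)^2.
    w, b, x, t = 0.5, 0.1, 2.0, 1.0
    y = w * x + b
    grad_w = 2 * (y - t) * x          # analytic partial derivative dL/dw
    grad_b = 2 * (y - t)              # analytic partial derivative dL/db

    # Finite-difference check: nudge w slightly and see how the loss moves.
    eps = 1e-6
    loss = lambda w_: ((w_ * x + b) - t) ** 2
    numeric_grad_w = (loss(w + eps) - loss(w - eps)) / (2 * eps)
    print(grad_w, numeric_grad_w)     # the two values should closely match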

Understanding gradient descent

A good way to think about the loss of a deep learning model is as a point on a three-dimensional loss landscape with many different hills and valleys, where valleys are more optimal, as shown in Figure 2.4.

Figure 2.4 – An example loss landscape
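To build intuition for what such a surface can look like, here is a hedged sketch that plots a made-up two-parameter “loss” with hills and valleys using Matplotlib; the function itself is invented purely for visualization:

    import numpy as np
    import matplotlib.pyplot as plt

    # A made-up two-parameter "loss" surface with hills and valleys,
    # purely to illustrate what a loss landscape can look like.
    w1, w2 = np.meshgrid(np.linspace(-3, 3, 100), np.linspace(-3, 3, 100))
    loss = np.sin(w1) * np.cos(w2) + 0.1 * (w1 ** 2 + w2 ** 2)

    fig = plt.figure()
    ax = fig.add_subplot(projection="3d")
    ax.plot_surface(w1, w2, loss, cmap="viridis")
    ax.set_xlabel("weight 1")
    ax.set_ylabel("weight 2")
    ax.set_zlabel("loss")
    plt.show()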

In reality, however, we can only approximate these loss landscapes, as the parameter values of a neural network can exist in an infinite number of combinations. The most common way practitioners monitor the behavior of the loss during each epoch of training and validation is to simply plot a two-dimensional line graph with the x axis being the epochs executed and the y axis being the loss value. An epoch is a single iteration through the entire dataset during the training process of a neural network. The loss landscape in Figure 2.4 is a three-dimensional approximation of a neural network’s loss landscape. To visualize the three-dimensional loss landscape in...
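For the two-dimensional line graph just described, a minimal sketch could look like the following; the loss values here are made up for illustration:

    import matplotlib.pyplot as plt

    # Illustrative training/validation losses recorded once per epoch.
    train_loss = [0.9, 0.6, 0.45, 0.38, 0.33, 0.30]
    val_loss = [0.95, 0.70, 0.55, 0.50, 0.49, 0.50]
    epochs = range(1, len(train_loss) + 1)

    plt.plot(epochs, train_loss, label="training loss")
    plt.plot(epochs, val_loss, label="validation loss")
    plt.xlabel("epoch")
    plt.ylabel("loss")
    plt.legend()
    plt.show()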

Implementing an MLP from scratch

Today, the process of creating a neural network and its layers, along with the backpropagation process, has been encapsulated in deep learning frameworks. The differentiation process has been automated, so there is no need to define the derivative formulas manually. Removing the abstraction layer provided by the deep learning libraries will help solidify your understanding of neural network internals. So, let’s create this neural network manually and explicitly, with the logic for the forward pass and backward pass, instead of using the deep learning libraries:

  1. We’ll start by importing numpy and the methods from the scikit-learn library to load sample datasets and perform data partitioning:
    import numpy as np
    from sklearn import datasets
    from sklearn.model_selection import train_test_split
  2. Next, we define ReLU, the activation function that makes an MLP non-linear:
    def ReLU(x):
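      # Element-wise rectified linear unit: negative values become zero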
      return np.maximum(x, 0)
  3. Now, let’s define...
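The remaining steps are omitted here, but to give a sense of where this is heading, below is a rough sketch, under the assumption of a single hidden layer and a mean squared error loss (not the book’s exact code), of what the manual forward and backward passes could look like:

    # Hypothetical forward/backward pass for a single-hidden-layer MLP.
    # X: (batch, n_in) inputs, y: (batch, 1) targets; W1, b1, W2, b2 are parameters.
    def forward(X, W1, b1, W2, b2):
        hidden = ReLU(X @ W1 + b1)    # hidden layer activations
        output = hidden @ W2 + b2     # linear output layer
        return hidden, output

    def backward(X, y, hidden, output, W2):
        batch_size = X.shape[0]
        d_out = 2 * (output - y) / batch_size     # gradient of mean squared error
        grad_W2 = hidden.T @ d_out                # dL/dW2
        grad_b2 = d_out.sum(axis=0)               # dL/db2
        d_hidden = (d_out @ W2.T) * (hidden > 0)  # chain rule through ReLU
        grad_W1 = X.T @ d_hidden                  # dL/dW1
        grad_b1 = d_hidden.sum(axis=0)            # dL/db1
        return grad_W1, grad_b1, grad_W2, grad_b2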

Summary

MLPs are a foundational piece of architecture in deep learning: they transcend just processing tabular data and are more than an old architecture that has been superseded. MLPs are very commonly utilized as a sub-component in many advanced neural network architectures today, whether to provide more automatic feature engineering, to reduce the dimensionality of large features, or to shape features into the desired shapes for target predictions. Look out for MLPs or, more importantly, the fully connected layer, in the architectures that are going to be introduced in the next few chapters!

The automatic gradient computation provided by deep learning frameworks simplifies the implementation of backpropagation and allows us to focus on designing new neural networks. It is essential to ensure that the mathematical functions used in these networks are differentiable, although this is often taken care of when adopting successful research findings. And that’s the beauty of...
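As a tiny illustration of this automatic gradient computation, here is a generic PyTorch snippet (not code from this chapter):

    import torch

    # PyTorch records the operations into a graph and differentiates automatically.
    w = torch.tensor(0.75, requires_grad=True)
    x, t = torch.tensor(2.0), torch.tensor(1.0)
    loss = (w * x - t) ** 2   # every operation used must be differentiable
    loss.backward()           # backpropagation in a single call
    print(w.grad)             # dL/dw = 2 * (w*x - t) * x = tensor(2.)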
